ABSTRACT

The problem of computing reliability and availability and their associated
confidence limits for multi-component systems has appeared often in the literature.
This problem arises where some or all of the component reliabilities and
availabilities are statistical estimates (random variables) from test and other
data. The problem of computing confidence limits has generally been considered
difficult and treated only on a case-by-case basis. This paper deals with
Bayes confidence limits on reliability and availability for a more general class of
systems than previously considered including, as special cases, series-parallel
and standby systems applications. The posterior distributions obtained are exact
in theory and their numerical evaluation is limited only by computing
resources, data representation and round-off in calculations. This paper collects
and generalizes previous results of the authors and others.

The methods presented in this paper apply both to reliability and availability
analysis. The conceptual development requires only that system reliability or
availability be probabilities defined in terms acceptable for a particular application.
The emphasis is on Bayes analysis and the determination of the posterior
distribution functions. Having these, the calculation of point estimates and
confidence limits is routine.

This paper includes several examples of estimating system reliability and
confidence limits based on observed component test data. Also included is an
example of the numerical procedure for computing Bayes confidence limits for
the reliability of a system consisting of N failure independent components connected
in series. Both an exact and a new approximate numerical procedure for
computing point and interval estimates of reliability are presented. A comparison
is made of the results obtained from the two procedures. It is shown that
the approximation is entirely sufficient for most reliability engineering analysis.


INTRODUCTION

The problem of computing reliability, availability, and confidence limits for multicomponent
systems where some or all of the component reliabilities and availabilities are statistical
estimates from test and other data has appeared often in the literature. The problem of computing
these confidence limits has generally been considered difficult and treated only on a
case-by-case basis. The present paper deals with Bayes confidence limits on reliability and steady
state availability for a general class of fixed mission time, two-state systems including, as special
cases, series-parallel, standby and others that appear in the applications. Further, a fixed mission
length is assumed. It is also assumed that neither reliability growth nor deterioration occurs
during the life of the system and that the system becomes as good as new after each repair. Finally,
we assume that no environmental changes that could affect reliability occur. The posterior
distributions obtained are exact in theory and their numerical evaluation is limited only by computing
resources, data representation and round-off in calculation. The present paper collects
and generalizes previous results of the authors and others.

The methods obtained in the following apply both to reliability and steady state availability
analysis. To avoid repeated reference to "reliability or availability", the discussion references
only reliability, with the understanding that the terms system reliability $R$ and component
reliability $r_i$ can be replaced by system availability $A$ and component availability $a_i$. The conceptual
development requires only that $R$ and $A$ be probabilities defined in terms acceptable for a particular
application. The emphasis is on the determination of the posterior distribution functions.
Having these, the calculation of point estimates and confidence intervals is routine.

BAYES  CONFIDENCE  INTERVALS 

In the Bayes inference model, the unknown probability $R$, $0 < R < 1$, is considered a
random variable whose posterior density is the result of combining prior information with test
data to obtain a probability density function $f(R)$ for $R$. If the posterior density of $R$ is
spread out, there is relatively more uncertainty in the value of $R$ than when the posterior
density is concentrated closely about some particular value. The posterior density function
provides the most complete form of information about $R$, but sometimes summary information
is desired. A point estimate is one such form of summary information. It can be selected
in various ways and is analogous to the familiar statistical problem of characterizing an entire
population by some parameter value. Examples are the mean, mode, and median. A point estimate
has the disadvantage of ignoring the information concerning the uncertainty in the unknown
reliability. Confidence intervals derived from $f(R)$ provide such additional information.
The true but unknown (and unknowable except with infinite data) reliability $R_0$ is some specific
value of the random variable $R$, $0 < R < 1$. Conceptually, $R_0$ can be considered a random
sample from $0 < R < 1$ made when the system was built. We can never know what $R_0$ is, but if

$$F(R) = \int_0^R f(x)\,dx$$

denotes the distribution function of $R$, then

$$\mathrm{Prob}\,[R_1 < R_0 < R_2] = F(R_2) - F(R_1)$$

and $[R_1, R_2]$ is an interval estimate of $R$ of confidence $c = F(R_2) - F(R_1)$. The interpretation
is simply that, based on the prior and current data, the probability is $c$ that the unknown
system reliability lies between $R_1$ and $R_2$. The interval $[R_1, R_2]$ has been called [25] a Bayes $c$
level confidence interval. For $R_2 = 1$, $R_1$ is called the lower $c$ level confidence limit. For
$R_1 = 0$, $R_2$ is called the upper $c$ level confidence limit. Given $f(R)$ and $F(R)$, Bayes
confidence limits for any $c$ can be obtained by graphical or numerical methods and the procedure
is generally not difficult. Numerical examples and discussion of numerical methods are
given in [25,27,8,26,28,29].
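As a minimal sketch of the numerical route, the following Python finds a lower $c$ level Bayes confidence limit by bisection on a distribution function. The posterior used here, $F(R) = R^{21}$ (density $21R^{20}$), is an illustrative assumption, and the names `lower_confidence_limit` and `example_cdf` are hypothetical.

```python
def lower_confidence_limit(cdf, c, tol=1e-12):
    """Return R1 such that 1 - cdf(R1) = c, i.e. Prob[R1 < R0 < 1] = c.

    Assumes cdf is continuous and nondecreasing on [0, 1] with
    cdf(0) = 0 and cdf(1) = 1.
    """
    target = 1.0 - c          # with R2 = 1, F(R1) must equal 1 - c
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Illustrative posterior (an assumption for this sketch): f(R) = 21 R^20.
def example_cdf(R):
    return R ** 21


r1 = lower_confidence_limit(example_cdf, 0.90)
```

For this particular posterior the limit has the closed form $(1-c)^{1/21}$, which gives a direct check on the bisection.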

DEFINITION  OF  STRUCTURE  FUNCTION 

To establish the relationship between the reliabilities of the components of a system and
the reliability of the entire system, the way in which performance and failure of the components
affect performance and failure of the system must be specified. For this purpose, as
in [5,10,15], the state of any component is coded 1 when it performs and 0 when it fails. The
state of all $N$ components of the system can then be coded by a vector of $N$ coordinates

$$x = (x_1, x_2, \ldots, x_N)$$

where $x_i = 0$ means the $i$-th component fails and $x_i = 1$ means that it does not fail. All possible
states of the system are represented by the $2^N$ different values this vector can assume.

Where an explicit mission time dependence is required, a random process
$y(t) = [y_1(t), \ldots, y_N(t)]$ can be defined as in [15] so that to each component trajectory a
measure $x_i$ is assigned. Then, for example, $x_i = 1$ if $y_i(t)$ is a failure-free process over some
interval $0 \le \tau_{i1} \le t \le \tau_{i2}$, and $x_i = 0$ if at least one failure occurs.

Some of the $2^N$ states cause the system to fail and the others cause the system to perform.
The response of the system as a whole is written as a function $\phi(x)$ of $x$ such that $\phi(x) = 0$
when the system is failed in state $x$, and $\phi(x) = 1$ when the system performs in state $x$. This
function $\phi(x)$ is known in the literature [5,10,23] and has been called a structure function of
order $N$.

The structure function can be written in a systematic way for any series-parallel system.
When the system is not too large the structure function can also be written by observation for
many more general systems. The structure function can always be written for a system of $N$
components by enumeration of its $2^N$ states. For large systems this is at best very tedious, but
generally short cuts can be found which simplify the process. The structure function is convenient
for conceptual development of the theory and provides a very general notation, which is
why it is used here. What is required in the application of the present results is the formula for
system reliability in terms of component reliabilities, as is done in [25], [27], and [8]. The structure
function provides this formula in a general form but other methods are available. Some of
these methods are identified and referenced in [17] along with a new and useful algorithm
based on graph theory.

DEFINITION  OF  RELIABILITY  FUNCTION 

Assume that the components of the system are failure independent so that the elements
of the state vector $x = (x_1, \ldots, x_N)$ are independent random variables with probability
distributions

$$\Pr\{x_i = 1\} = r_i$$

$$\Pr\{x_i = 0\} = 1 - r_i$$

where $r_i$ is the reliability of the $i$-th component.

The structure function $\phi(x)$ is also a random variable with

$$\Pr\{\phi(x) = 1\} = R$$

$$\Pr\{\phi(x) = 0\} = 1 - R$$

where $R$ is the reliability of the system. $R$ is the expected value of $\phi(x)$ so that

$$R = E[\phi(x)] = \sum_x \phi(x)\, r_1^{x_1}(1 - r_1)^{1 - x_1} \cdots r_N^{x_N}(1 - r_N)^{1 - x_N} \tag{1}$$

where the summation is over all $2^N$ states of the system.
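The enumeration in (1) can be sketched directly in Python. This is a minimal illustration, not an efficient method for large $N$; the function name `system_reliability` and the example structure (component 1 in series with a parallel pair of components 2 and 3) are assumptions introduced here.

```python
from itertools import product


def system_reliability(phi, r):
    """Evaluate equation (1): sum of phi(x) * prod r_i^x_i (1 - r_i)^(1 - x_i)
    over all 2^N component state vectors x."""
    total = 0.0
    for x in product((0, 1), repeat=len(r)):
        if phi(x):
            weight = 1.0
            for xi, ri in zip(x, r):
                weight *= ri if xi == 1 else (1.0 - ri)
            total += weight
    return total


# Hypothetical structure: component 1 in series with the parallel pair (2, 3).
def phi_example(x):
    return x[0] * (1 - (1 - x[1]) * (1 - x[2]))


R = system_reliability(phi_example, [0.9, 0.8, 0.7])
```

For this structure the result agrees with the closed form $r_1[1 - (1 - r_2)(1 - r_3)]$.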

In a particular application, given the structure function and the values of all component
reliabilities, the system reliability, $R$, can be computed explicitly using (1). References [5],
[10], and [23] provide further discussion with examples of $\phi$ and $R$.

RELIABILITY ESTIMATION FROM TEST DATA

In many applications the system structure is known but some or all of the component
reliabilities are unknown and must be estimated from tests and other data. As a result, statements
concerning these component and system reliabilities are subject to the uncertainties of statistical
estimation. A method of treating this uncertainty is provided by a Bayes analysis which considers
the unknown component reliabilities as random variables and leads to Bayes confidence
intervals for both component and system reliabilities. The following is an extension and
generalization of previous analysis of this kind [25,27,8,7,14,23,29].

BAYES  MODEL 

Assume a system of $N$ failure independent components has a known structure function
$\phi(x)$ and reliability function $R(r)$, $r = (r_1, \ldots, r_N)$, of the form (1). Suppose that among the
$N$ separate components of the system some are known to have identical reliabilities, say $i$, $j$, and
$k$, for example. Then, since $r_i = r_j = r_k$, the symbols $r_j$ and $r_k$ can be replaced by $r_i$ everywhere
in (1). In this way there remain only $N' \le N$ different $r$'s, one for each reliability
value. In addition, suppose that among the $N'$ different component reliabilities $N' - n$ are
known constants, so that there remain $n$ different types of components with unknown reliabilities.
By a simple change in notation these $n$ different, unknown reliabilities are denoted by
$p = (p_1, p_2, \ldots, p_n)$.

By multiplying out factors $(1 - p_i)$ and collecting terms, the system reliability (1) can then
be written in the equivalent form

$$R(p) = \sum_j a_{0j}\, p_1^{a_{1j}} \cdots p_n^{a_{nj}} \tag{2}$$

where the constants $a_{ij}$ are integers for $i \ne 0$.

Using a Bayes inference model, the unknown $p_i$ are considered independent random variables
with known posterior density functions,

$$f_i(p_i), \quad 0 < p_i < 1, \quad i = 1, \ldots, n.$$

The system reliability, $R(p)$, is then also a random variable, defined by (1), with unknown
distribution function $H(R)$.

In applications, what is required is the calculation of $H(R)$ given the $f_i(p_i)$,
$i = 1, 2, \ldots, n$. Having obtained $H(R)$, point estimates and confidence intervals on $R$ can be
obtained directly. This result is also required for risk, cost and other analyses based on the
Bayes model. The method for an explicit numerical evaluation is presented in the following
section.

EVALUATION  OF  THE  POSTERIOR  DISTRIBUTION 

The proposed method of evaluating the posterior distribution function $H(R)$ is based on
an expansion of $H(R)$ in Chebyshev polynomials of the second kind [1,16]. The main advantages
of this method lie in the rapid convergence properties of the Chebyshev expansion and
the convenient numerical computation for its evaluation. Although a description of the procedure
has been presented in [8] and [7], for the sake of completeness we shall outline the
main steps below.

Expansion by Chebyshev Polynomials


Let $H(R)$ denote the posterior distribution,

$$H(R) = \int_0^R h(x)\,dx, \quad 0 < R < 1,$$

where $h(R)$ is the posterior density of the reliability of the overall system. By definition,
$H(R)$ satisfies the boundary conditions:

$$H(0) = 0; \quad H(1) = 1. \tag{3}$$

Let us introduce a new function $Q(R)$ defined by

$$Q(R) = H(R) - R. \tag{4}$$

Then $Q(R)$ satisfies the boundary conditions

$$Q(0) = Q(1) = 0 \tag{5}$$

and can be expanded in a Fourier sine series of the following form:

$$Q(R) = \frac{4}{\pi}\sin\theta\left[b_0 + b_1\frac{\sin 2\theta}{\sin\theta} + \cdots + b_k\frac{\sin (k+1)\theta}{\sin\theta} + \cdots\right] \tag{6}$$

where the angular variable $\theta$ is related to $R$ by the relation

$$R = \cos^2\frac{\theta}{2}. \tag{7}$$

The coefficients $b_k$ of the expansion (6) can be determined by:

$$b_k = \int_0^1 [H(R) - R]\,U_k^*(R)\,dR \tag{8}$$

where

$$U_k^*(R) = \frac{\sin (k+1)\theta}{\sin\theta}$$

is the shifted Chebyshev polynomial of the second kind [1,16], which can be computed by the
recursion relation

$$U_{k+1}^*(R) = (4R - 2)\,U_k^*(R) - U_{k-1}^*(R) \tag{9}$$

with

$$U_0^*(R) = 1, \quad U_1^*(R) = -2 + 4R, \quad U_2^*(R) = 3 - 16R + 16R^2.$$

If we express $U_k^*(R)$ explicitly as a $k$th order polynomial

$$U_k^*(R) = \sum_{i=0}^{k} C_{ik} R^i \tag{10}$$

then Equation (8) becomes

$$b_k = \sum_{i=0}^{k} C_{ik}\left[\int_0^1 R^i H(R)\,dR - \int_0^1 R^{i+1}\,dR\right]. \tag{11}$$

It can be shown, integrating by parts, that

$$M_i\{H(R)\} = \frac{1}{i+1}\,\{1 - M_{i+1}[h(R)]\}, \tag{12}$$

where $M_i\{H(R)\} = \int_0^1 R^i H(R)\,dR$ and $M_{i+1}[h(R)]$ is the $(i+1)$-th moment of $h(R)$.
Thus, Equation (11) becomes

$$b_k = \sum_{i=0}^{k} C_{ik}\left[\frac{1 - M_{i+1}[h(R)]}{i+1} - \frac{1}{i+2}\right]. \tag{13}$$

Note that the Chebyshev coefficients $C_{ik}$ can be computed independently of the moments.
They may be stored in the form of a triangular matrix if sufficient storage space is available. A
simple algorithm for recursively calculating the coefficients is $C_{i,k+1} = 4C_{i-1,k} - 2C_{ik} - C_{i,k-1}$.
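The triangular table of coefficients can be built directly from the recursion $U_{k+1}^*(R) = (4R - 2)U_k^*(R) - U_{k-1}^*(R)$. The sketch below is one possible arrangement; the function name and the list-of-rows storage are assumptions, not taken from the references.

```python
def shifted_chebyshev_u_coefficients(kmax):
    """Return rows C[k] = [C_0k, ..., C_kk] with U*_k(R) = sum_i C_ik R^i,
    built from U*_{k+1} = (4R - 2) U*_k - U*_{k-1}."""
    rows = [[1]]                      # U*_0(R) = 1
    if kmax >= 1:
        rows.append([-2, 4])          # U*_1(R) = -2 + 4R
    for k in range(1, kmax):
        prev, cur = rows[k - 1], rows[k]
        nxt = [0] * (k + 2)
        for i, c in enumerate(cur):
            nxt[i] += -2 * c          # contribution of -2 * U*_k
            nxt[i + 1] += 4 * c       # contribution of 4R * U*_k
        for i, c in enumerate(prev):
            nxt[i] -= c               # contribution of -U*_{k-1}
        rows.append(nxt)
    return rows


C = shifted_chebyshev_u_coefficients(3)
```

Integer arithmetic keeps the table exact; the coefficients grow quickly with $k$, so floating-point storage would lose accuracy for large orders.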


Computations  and  Results 

To complete the analysis it remains to compute the moments of $h(R)$ given the density
functions $f_i(p_i)$ and then use (13) to compute the $b_k$.

From (2), $R^k(p)$, $k = 1, 2, \ldots$, can be written as a finite sum

$$R^k(p) = \sum_j a_{0jk}\, p_1^{a_{1jk}} \cdots p_n^{a_{njk}} \tag{14}$$

where the $a_{ijk}$ are independent of the $p_i$ and also integers for $i \ne 0$. Using this result and the
fact that the expected value of a sum is the sum of the expected values and the expected value
of a product of independent random variables is the product of the expected values, it follows
that

$$M_k[h] = \sum_j a_{0jk}\, M_{a_{1jk}} \cdots M_{a_{njk}} \tag{15}$$

where $M_{a_{ijk}}$ denotes the $a_{ijk}$-th moment of $p_i$.
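The moment calculation in (14) and (15) can be sketched as follows. The polynomial representation (a dictionary from exponent tuples to coefficients), the function names, and the uniform component posteriors are assumptions chosen for illustration.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Multiply polynomials stored as {exponent tuple: coefficient}."""
    out = {}
    for ea, ca in p.items():
        for eb, cb in q.items():
            e = tuple(i + j for i, j in zip(ea, eb))
            out[e] = out.get(e, 0) + ca * cb
    return out


def system_moment(poly, k, component_moments):
    """k-th moment of R(p): expand R^k as in (14), then apply (15).

    component_moments[i](m) must return the m-th moment of p_i.
    """
    n = len(component_moments)
    power = {(0,) * n: Fraction(1)}
    for _ in range(k):
        power = poly_mul(power, poly)
    total = Fraction(0)
    for exps, coef in power.items():
        term = coef
        for i, m in enumerate(exps):
            term *= component_moments[i](m)
        total += term
    return total


# Illustrative assumption: two components with uniform posteriors,
# whose m-th moment is 1 / (m + 1).
uniform = lambda m: Fraction(1, m + 1)
series = {(1, 1): Fraction(1)}                      # R = p1 p2
parallel = {(1, 0): Fraction(1), (0, 1): Fraction(1),
            (1, 1): Fraction(-1)}                   # R = p1 + p2 - p1 p2
m2_parallel = system_moment(parallel, 2, [uniform, uniform])
```

Exact rational arithmetic is used so that the alternating signs in the expansion do not cause round-off.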

Having determined the coefficients $b_k$ we can write down the final expression for $H(R)$
from Equations (4) and (6) as follows:

$$H(R) = R + \frac{8}{\pi}\sqrt{R(1-R)}\,\left[b_0 + b_1 U_1^*(R) + \cdots + b_k U_k^*(R) + \cdots\right]. \tag{16}$$

This result is exact in the sense that the error can be made arbitrarily small by taking a
sufficient number of terms. References [8] and [7] give a discussion of numerical considerations
and examples. Generally, (16) has been found very convenient for numerical calculation
using an electronic digital computer.
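A compact sketch of the whole evaluation is given below. It assumes the coefficient formula $b_k = \sum_i C_{ik}[(1 - M_{i+1})/(i+1) - 1/(i+2)]$ and the prefactor $8/\pi$, which follows from $R = \cos^2(\theta/2)$ since then $\sin\theta = 2\sqrt{R(1-R)}$. The test density $h(R) = 2R$, the truncation at 17 terms, and all function names are illustrative assumptions.

```python
import math
from fractions import Fraction


def cheb_rows(kmax):
    """Coefficient rows of U*_k(R) from U*_{k+1} = (4R - 2)U*_k - U*_{k-1}."""
    rows = [[Fraction(1)], [Fraction(-2), Fraction(4)]]
    for k in range(1, kmax):
        nxt = [Fraction(0)] * (k + 2)
        for i, c in enumerate(rows[k]):
            nxt[i] -= 2 * c
            nxt[i + 1] += 4 * c
        for i, c in enumerate(rows[k - 1]):
            nxt[i] -= c
        rows.append(nxt)
    return rows[:kmax + 1]


def expansion_coefficients(moment, kmax):
    """b_0..b_kmax; moment(j) returns the j-th moment of h as a Fraction."""
    return [sum(c * ((1 - moment(i + 1)) / (i + 1) - Fraction(1, i + 2))
                for i, c in enumerate(row))
            for row in cheb_rows(kmax)]


def posterior_cdf(R, b):
    """Truncated expansion of H(R), using U*_k = sin((k+1)theta)/sin(theta)."""
    if R <= 0.0:
        return 0.0
    if R >= 1.0:
        return 1.0
    theta = math.acos(2.0 * R - 1.0)
    series = sum(float(bk) * math.sin((k + 1) * theta) / math.sin(theta)
                 for k, bk in enumerate(b))
    return R + (8.0 / math.pi) * math.sqrt(R * (1.0 - R)) * series


# Illustrative assumption: h(R) = 2R, so M_k = 2/(k+2) and H(R) = R^2.
b = expansion_coefficients(lambda k: Fraction(2, k + 2), 16)
approx = posterior_cdf(0.5, b)
```

The coefficients are computed in exact rational arithmetic because the polynomial coefficients $C_{ik}$ grow rapidly and alternate in sign; only the final sum is evaluated in floating point.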


MODELS  FOR  APPLICATION 

To evaluate $H(R)$ the posterior distribution $f_i(p_i)$ for each different component reliability
$p_i$ is required. The derivation of these requires application of Bayes inference procedures on a
case-by-case basis. The theory can be found in [20,4,19,2,3,24,6,18] and some specific applications
in [25,27,8,7,14,1,16,12,23]. A tabulation for some familiar models of mathematical
reliability theory is presented in the following.

Component  With  Constant  Failure  Rate 

A single component has an unknown constant failure rate $\lambda$ and fixed mission time $t$.
Component reliability $p = \exp(-\lambda t)$ is regarded as a random variable. The natural conjugate
prior density function is

$$f(p \mid r_0, b_0) = \frac{(b_0 + 1)^{r_0 + 1}}{\Gamma(r_0 + 1)}\, p^{b_0} (\ln 1/p)^{r_0}, \quad 0 < p < 1,$$

with parameters $b_0$ and $r_0$. When test data consist of $T$ operating hours after $r$ failures,

$$T = t_1 + t_2 + \cdots + t_r + (m - r)\,t_r.$$

Here $t_r$ is the time of the $r$-th failure among $m$ items initially on test. Failures are not replaced and
the test is terminated at the $r$-th failure. The resulting posterior density function of $p$ is

$$f(p \mid a, b) = \frac{(b + 1)^{a + 1}}{\Gamma(a + 1)}\, p^{b} (\ln 1/p)^{a}, \quad 0 < p < 1,$$

where $a = r + r_0$ and $b = T/t + b_0$. The $k$-th moment of $f(p)$ is

$$M_k[f] = (b + 1)^{a+1} (k + b + 1)^{-a-1}.$$

The above results are from Reference [25].
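The moment formula can be checked by a short worked step. Taking the posterior in normalized form, $f(p) = \frac{(b+1)^{a+1}}{\Gamma(a+1)}\,p^b(\ln 1/p)^a$ (the normalizing constant is inferred here from the requirement $M_0 = 1$), and substituting $u = \ln(1/p)$, a substitution introduced only for this illustration:

$$M_k[f] = \frac{(b+1)^{a+1}}{\Gamma(a+1)}\int_0^1 p^{k+b}\,(\ln 1/p)^a\,dp = \frac{(b+1)^{a+1}}{\Gamma(a+1)}\int_0^\infty u^a e^{-(k+b+1)u}\,du = \frac{(b+1)^{a+1}}{(k+b+1)^{a+1}}.$$

For instance, with $a = 3$ and $b = 19/3$ the first moment is $(22/25)^4 \approx 0.5997$.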

Component  Having  Fixed  Probability  of  Success 

A single component has an unknown fixed probability of success, $p$. In testing, there
were observed $m$ successes in $n$ trials. For the natural conjugate beta prior density function
with parameters $m_0$ and $n_0$ the posterior density function of $p$ is

$$f(p \mid a, b) = \frac{p^a (1 - p)^b}{B(a + 1, b + 1)}$$

where

$$a = m + m_0, \quad b = n + n_0 - a \quad \text{and} \quad B(a + 1, b + 1) = \int_0^1 p^a (1 - p)^b\,dp.$$

The $k$-th moment of $f(p \mid a, b)$, $k = 0, 1, 2, \ldots$, is:

$$M_k[f] = \frac{(a + b + 1)!\,(a + k)!}{a!\,(a + b + k + 1)!} = \frac{\Gamma(a + b + 2)\,\Gamma(a + k + 1)}{\Gamma(a + 1)\,\Gamma(a + b + k + 2)}.$$

This result is from [26].
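For a density $p^a(1-p)^b / B(a+1, b+1)$ the $k$-th moment equals $B(a+k+1, b+1)/B(a+1, b+1)$, which can be evaluated as a running product without large factorials. The sketch below assumes integer $a$ and $b$; the function name and the test data (18 successes in 20 trials with a uniform prior) are illustrative.

```python
from fractions import Fraction


def beta_posterior_moment(a, b, k):
    """k-th moment of f(p) = p^a (1-p)^b / B(a+1, b+1) for integer a, b >= 0.

    Uses M_k = prod_{j=1..k} (a + j) / (a + b + 1 + j), which is the
    ratio B(a+k+1, b+1) / B(a+1, b+1) written without factorials.
    """
    m = Fraction(1)
    for j in range(1, k + 1):
        m *= Fraction(a + j, a + b + 1 + j)
    return m


# Illustrative data (an assumption for this sketch): 18 successes in
# 20 trials with a uniform prior, so a = 18 and b = 2.
mean = beta_posterior_moment(18, 2, 1)
second = beta_posterior_moment(18, 2, 2)
```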

Steady  State  Availability  of  Component  With  Repair 

A two state component has exponential distributions of life and of repair times. The
durations of intervals of operation and repair define two different statistically independent
sequences of identically distributed, mutually independent random variables. Both the mean up
time, $1/\lambda$, and the mean repair time, $1/\mu$, are unknown parameters estimated from test and prior
data.

The long term availability of the component is a function of the random variables $\mu$ and $\lambda$,
i.e.:

$$a = \mu/(\lambda + \mu).$$

Assuming gamma priors for $\lambda$ and $\mu$ with snapshot, life and repair time data, the posterior
density of availability $a$ is the Euler density function:

$$f(a \mid r, w, \delta) = \frac{(1 - \delta)^{w}\, a^{w-1} (1 - a)^{r-1}}{B(r, w)\,(1 - \delta a)^{r+w}},$$

$$0 < a < 1; \quad r > 0; \quad w > 0; \quad |\delta| < 1.$$

The parameters $r$, $w$ and $\delta$ are determined by test data and prior information as defined in
[25].

The moments of $f(a)$ are given in [25] in terms of Gauss' hypergeometric function
${}_2F_1(w + r, w + k; w + r + k; \delta)$. (Note the typographical error in [25] where $k$ in ${}_2F_1$ is
replaced by $r$.)

A special case of this availability model treating only "snapshot" data is given in [28].
Snapshot data, defined in [25,28], record only the state of the system (up or down) at random
instants of time.

RULES  OF  COMBINATION  FOR  SOME  BASIC  SYSTEM  ELEMENTS 

Components  are  often  combined  to  form  system  elements  which  are  special  in  some 
sense.  For  example,  the  same  multicomponent  element  may  appear  several  times  as  a  unit  in 
the  same  system.  In  this  case,  it  may  be  convenient  to  treat  the  element  as  a  single  system 
component.  Some  simple  multicomponent  system  elements  are  presented  in  the  following: 

N  Identical  Components  in  Series 

The reliability, $p$, of $N$ identical components in series is $p = p_1^N$.

Component reliability $p_1$ is a random variable in the Bayes representation with known
posterior density, $f_1(p_1)$. The moments $M_{k,1}[f_1]$, $k = 0, 1, \ldots$, of $f_1(p_1)$ are then also known.
The moments $M_k[f]$ of the posterior density $f(p)$ of $p$ are related to moments of the $f_1$ by

$$M_k[f] = M_{Nk,1}[f_1]; \quad k = 0, 1, 2, \ldots$$

Using this result one can write the moments of the posterior density of series combinations of
any of the special components treated in the previous section.

N  Identical  Redundant  Components 

When only one is required to operate in order that the system operates, the reliability,
$p$, of $N$ identical failure independent redundant components is $p = 1 - (1 - p_1)^N$, where $p_1$
is the Bayes representation of the component reliability. It is shown in [8] that the moments
$M_k[f]$ of the posterior density $f(p)$ of $p$ are related to the moments $M_{k,1}[f_1]$ of the posterior
density $f_1(p_1)$ of $p_1$ by a relation that follows from the binomial expansion of
$[1 - (1 - p_1)^N]^k$:

$$M_k[f] = \sum_{j=0}^{k} (-1)^j \binom{k}{j} \sum_{i=0}^{Nj} (-1)^i \binom{Nj}{i} M_{i,1}[f_1].$$

By alternately applying this result and the previous one for components in series, the
moments of the posterior density of any series-parallel system of components can be obtained.
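The two combination rules can be sketched in Python as follows. The parallel rule is written here as a double binomial expansion of $[1 - (1 - p_1)^N]^k$, which is one way of expressing it and not necessarily the form given in [8]; the function names and the uniform component posterior are assumptions for illustration.

```python
from fractions import Fraction
from math import comb


def series_moment(component_moment, N, k):
    """k-th moment of p = p1^N: simply the (N k)-th moment of p1."""
    return component_moment(N * k)


def parallel_moment(component_moment, N, k):
    """k-th moment of p = 1 - (1 - p1)^N by binomial expansion."""
    total = Fraction(0)
    for j in range(k + 1):
        # E[(1 - p1)^(N j)] expanded in moments of p1
        inner = sum((-1) ** i * comb(N * j, i) * component_moment(i)
                    for i in range(N * j + 1))
        total += (-1) ** j * comb(k, j) * inner
    return total


# Illustrative assumption: uniform posterior for p1, M_m = 1/(m+1).
uniform = lambda m: Fraction(1, m + 1)
m1 = parallel_moment(uniform, 2, 1)
```

Nesting these two functions, with the output moments of one element used as the component moments of the next, gives the moments of a series-parallel arrangement of identical elements.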

A "2 out of 3" Element

An element consisting of three identical failure independent components, which operates
if any two or more of the components operate, is sometimes called a "2 out of 3 voter" [21].
The structure function of this element is

$$\phi(x_1, x_2, x_3) = \begin{cases} 1 & \text{if } x_1 + x_2 + x_3 \ge 2 \\ 0 & \text{if } x_1 + x_2 + x_3 < 2 \end{cases}$$

and the reliability $p$ is

$$p = 3p_1^2 - 2p_1^3$$

where $p_1$ is the component reliability. If the posterior density $f_1(p_1)$ of $p_1$ has moments
$M_{k,1}[f_1]$, then the moments $M_k[f]$ of the posterior density $f(p)$ of $p$ are:

$$M_k[f] = 3^k \sum_{j=0}^{k} \binom{k}{j} \left(-\frac{2}{3}\right)^j M_{j+2k,1}[f_1].$$

This result follows using the fact that for $p = p_1^N$, $M_k[f(p)] = M_{Nk,1}[f_1]$, when applied term by
term to the expansion of

$$(3p_1^2 - 2p_1^3)^k.$$

Reference [21] gives the reliability function of the $N$-tuple Modular Redundant design
consisting of $N$ replicated units feeding a $(n+1)$-out-of-$N$ voter. This case can also be treated
by the present methods.
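As a small check of the "2 out of 3" case, the sketch below evaluates the moments of $p = 3p_1^2 - 2p_1^3$ from the moments of $p_1$ by expanding $(3p_1^2 - 2p_1^3)^k = 3^k p_1^{2k} (1 - \tfrac{2}{3} p_1)^k$ term by term. The function name and the uniform component posterior are assumptions for illustration.

```python
from fractions import Fraction
from math import comb


def two_of_three_moment(component_moment, k):
    """M_k[f] = 3^k * sum_j C(k, j) (-2/3)^j M_{j+2k,1}[f_1]."""
    total = Fraction(0)
    for j in range(k + 1):
        total += comb(k, j) * Fraction(-2, 3) ** j * component_moment(j + 2 * k)
    return 3 ** k * total


# Illustrative assumption: uniform posterior for p1, M_m = 1/(m+1).
uniform = lambda m: Fraction(1, m + 1)
m1 = two_of_three_moment(uniform, 1)
m2 = two_of_three_moment(uniform, 2)
```

With a uniform component posterior the first moment is $3 \cdot \tfrac{1}{3} - 2 \cdot \tfrac{1}{4} = \tfrac{1}{2}$, which the function reproduces.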

Exactly  L  Out  of  N  Element 


An element consisting of $N$ identical failure independent components which operates only
when exactly $L$ out of $N$ components operate is a rather unusual system. If $L + 1$ out of $N$
operate the system fails. Such a system is not a coherent structure in the sense of [5]. The
reliability $p$ of this element is given by

$$p = \binom{N}{L}\, p_1^L (1 - p_1)^{N - L}.$$

The moments of the posterior density $f(p)$ of the Bayes representation $p$ in terms of the
moments $M_{k,1}[f_1]$ of the posterior density $f_1(p_1)$ of the component reliability $p_1$ can be shown
to be

$$M_k[f] = \binom{N}{L}^k \sum_{j=0}^{(N-L)k} (-1)^j \binom{(N-L)k}{j} M_{Lk+j,1}[f_1].$$

This example serves to illustrate that the proposed evaluation is not restricted to coherent
systems.


DEVELOPMENT  OF  AN  APPROXIMATE  PRIOR  FOR  TESTING  AT  SYSTEM  LEVEL 

Section 9.4.4 of NAVORD OD 44622, Reference [22], presents a procedure for developing
the posterior beta distribution of system reliability for system level TECHEVAL/OPEVAL
testing. Reference [9] presents further discussion with an example. The observed system level
data is binomial, i.e., $r$ failures in $n$ trials. The system level, natural conjugate prior is the beta
density. An exact prior for the system level tests is the posterior density function based on all
prior component tests and component priors and can be computed by the methods above. The
procedure recommended in OD 44622 is to approximate the exact system prior with a beta
density having the same first and second moments.

Equation  (15)  above  provides  a  tractable  tool  for  computing  the  required  first  and  second 
moments  for  extending  the  method  to  arbitrary  system  structures. 

Let $M_1$ and $M_2$ denote the first and second moments computed as shown in this report
for the posterior density $f(R)$ of system reliability, $R$, based on prior component data. The
$f(R)$ is considered the exact prior for determination of a new posterior density based on binomial
system level data. What are required for the approximation are the parameters $n'$ and $r'$
of the beta prior

$$\frac{R^{n'-r'}(1 - R)^{r'}}{B(n' - r' + 1,\; r' + 1)}$$

with the same first and second moments as $f(R)$. Having computed $M_1$ and $M_2$ the answer is
direct using formulas on page 9.23 of NAVORD OD 44622, i.e.,

$$n' = [M_1(1 - M_1)/(M_2 - M_1^2)] - 1$$

$$r' = (1 - M_1)\,n'.$$

The gamma prior is treated in a similar way in the same reference.


The beta approximation can also be used directly as an approximation to the exact posterior
density function for complex systems based on component test data. The approximation
has been very good when compared with the exact result in examples treated by the authors.
The calculation is tractable for hand computation since only the first and second moments of
the exact posterior density function are required.

Numerical  Example 

Consider a system consisting of five components, $A_i$ ($i = 1, \ldots, 5$), connected in series.
Components $A_1$, $A_2$, $A_3$, and $A_4$ have unknown fixed probabilities of success, $p_i$, and in testing
there were observed $m_i$ successes in $n_i$ trials. The fifth component, $A_5$, has an unknown constant
failure rate $\lambda$ and has mission time $t$. In testing, component $A_5$ failed $r$ times in $T$ operating
hours. The following test data were observed:

$n_1 = 20$, $m_1 = 18$; $n_2 = 30$, $m_2 = 25$; $n_3 = 20$, $m_3 = 20$; $n_4 = 20$, $m_4 = 19$; $T = 38$, $t = 6$, $r = 3$.

The resulting posterior density functions are:

$$f_1(R_1) = 3990\, R_1^{18} (1 - R_1)^2$$

$$f_2(R_2) = 4417686\, R_2^{25} (1 - R_2)^5$$

$$f_3(R_3) = 21\, R_3^{20}$$

$$f_4(R_4) = 420\, R_4^{19} (1 - R_4)$$

$$f_5(R_5) = 482.00823\, R_5^{19/3} \left(\ln \frac{1}{R_5}\right)^3.$$


We know [25,26] that the Mellin integral transform of the posterior density function,
$h(R)$, for the system is the product of the Mellin integral transforms of the density functions of
the components. At this point we can determine $h(R)$ exactly by means of the inverse Mellin
integral transform or we can approximate $h(R)$ with a beta density function having the same
first and second moments as $h(R)$.

The Mellin integral transforms of the density functions for the components of the system
are

$$M[f_1(R_1) \mid S] = \frac{21!\;\Gamma(S + 18)}{18!\;\Gamma(S + 21)}$$

$$M[f_2(R_2) \mid S] = \frac{31!\;\Gamma(S + 25)}{25!\;\Gamma(S + 31)}$$

$$M[f_3(R_3) \mid S] = \frac{21!\;\Gamma(S + 20)}{20!\;\Gamma(S + 21)}$$

$$M[f_4(R_4) \mid S] = \frac{21!\;\Gamma(S + 19)}{19!\;\Gamma(S + 21)}$$

$$M[f_5(R_5) \mid S] = \frac{(22/3)^4}{(S + 19/3)^4}.$$

The Mellin integral transform of $h(R)$ is

$$M[h(R) \mid S] = \prod_{i=1}^{5} M[f_i(R_i) \mid S].$$

From [26] we know that the Mellin inversion integral yields directly

$$h(R) = \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} R^{-S}\, M[h(R) \mid S]\,dS$$

where the path of integration is any line parallel to the imaginary axis, with real part $c$, lying
to the right of the singularities of the integrand. If $b$ is greater than 1, the real part of $c$ is
greater than $-p$, and $p$ is any number, then [26]

$$\frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} \frac{R^{-S}}{(S + p)^b}\,dS = \frac{R^p\,(\ln 1/R)^{b-1}}{\Gamma(b)}.$$

To find $h(R)$ we simply write $M[h(R) \mid S]$ as the sum of its partial fractions [13] and
integrate each term using the above equation. Thus the exact posterior density function, $h(R)$,
for system reliability is

$$\begin{aligned}
h(R) = {} & 1094388844.948\,R^{18} + 30505643166.29\,R^{19} - 12601708553.76\,R^{19} \ln\frac{1}{R} \\
& - 31650550963.66\,R^{20} - 19915799047.82\,R^{20} \ln\frac{1}{R} - 5114357474.61\,R^{20} \left(\ln\frac{1}{R}\right)^2 \\
& + 235122603.404\,R^{25} - 354959810.01\,R^{26} + 249501799.456\,R^{27} \\
& - 98389473.63\,R^{28} + 21240815.37\,R^{29} - 1974044.939\,R^{30} \\
& - 22937.221\,R^{19/3} + 78073.717\,R^{19/3} \ln\frac{1}{R} - 95839.296\,R^{19/3} \left(\ln\frac{1}{R}\right)^2 \\
& + 42683.275\,R^{19/3} \left(\ln\frac{1}{R}\right)^3.
\end{aligned}$$

The  exact  distribution  function,  H(R),  is  found  by  integrating  the  density  function. 


To  obtain  the  approximate  solutions  for  the  system  reliability  density  and  distribution 
functions,  we  recall  that  the  first  and  second  moments  of  h(R)  are  given  by  M\h(R)\2\  and 
A/[/r  (/? ) |3]  respectively.  The  beta  density  function,  which  is  used  to  approximate  h(R),  is 


h(R) 


R°(  1  -  R)b 
3  (a  +  1,  b  +  1) 


where  h(R)  denotes  the  approximate  system  density  function,  /3(o  +  1,  6+1)  is  the  com¬ 
plete  beta  function,  and  a  and  b  are  the  parameters  of  the  beta  function.  The  first  moment  of 


h(R)  is 
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a  +  1 
a  +  b  +  2 


and  the  second  moment  is 


( a  ±  1)  (a  +  2) 

(a  +  b  +  2)  (a  +  b  +  3) 

We require that the first and second moments of $h(R)$ and $\hat{h}(R)$ be equal. Thus we have

$$M[h(R)|2] = \frac{a+1}{a+b+2}, \qquad M[h(R)|3] = \frac{(a+1)(a+2)}{(a+b+2)(a+b+3)}.$$

Solving simultaneously for $a$ and $b$ yields the parameters for the beta density function. Thus we have $a = 6.43596$ and $b = 11.92734$. Therefore we can now write $\hat{h}(R)$, the approximate density function for system reliability:

$$\hat{h}(R) = \frac{R^{6.43596}(1-R)^{11.92734}}{\beta(7.43596,\, 12.92734)}.$$

To determine the approximate distribution function, $\hat{H}(R)$, for system reliability we simply integrate $\hat{h}(R)$.
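The moment-matching step has a closed form, which the following Python sketch illustrates. Writing $s = a + b + 2$, the two moment equations give $s = (m_1 - m_2)/(m_2 - m_1^2)$, $a = m_1 s - 1$ and $b = (1 - m_1)s - 1$, where $m_1 = M[h(R)|2]$ and $m_2 = M[h(R)|3]$. The function names are illustrative and are not taken from the paper. Because the numerical values of the two moments are not quoted in the text, the example recovers them from the fitted parameters and checks the round trip.

```python
import math

def beta_params_from_moments(m1, m2):
    """Return (a, b) so that R^a (1-R)^b / B(a+1, b+1) has moments m1, m2."""
    var = m2 - m1 * m1
    if not (0.0 < m1 < 1.0) or var <= 0.0 or var >= m1 * (1.0 - m1):
        raise ValueError("moments are not attainable by a beta density")
    s = (m1 - m2) / var            # s = a + b + 2
    return m1 * s - 1.0, (1.0 - m1) * s - 1.0

def beta_moments(a, b):
    """First and second moments of the density R^a (1-R)^b / B(a+1, b+1)."""
    m1 = (a + 1.0) / (a + b + 2.0)
    m2 = m1 * (a + 2.0) / (a + b + 3.0)
    return m1, m2

def beta_density(r, a, b):
    """Approximate density evaluated at 0 < r < 1 (log-gamma for stability)."""
    log_beta = math.lgamma(a + 1.0) + math.lgamma(b + 1.0) - math.lgamma(a + b + 2.0)
    return math.exp(a * math.log(r) + b * math.log(1.0 - r) - log_beta)

# Round trip with the parameters quoted in the text.
m1, m2 = beta_moments(6.43596, 11.92734)
a_fit, b_fit = beta_params_from_moments(m1, m2)
```

Evaluating `beta_density` at the tabulated values of R gives a quick check of the "Approximate" density column in Table 1.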

Table  I  provides  the  comparison  between  the  results  obtained  by  the  exact  solution  and 
the  approximate  solution. 

TABLE 1. Numerical Results Obtained from Exact and Approximate Solutions

          Density Function            Distribution Function
  R       Exact      Approximate      Exact      Approximate
          h(R)       h(R)             H(R)       H(R)

  .0       .0         .0               .0         .0
  .10      .079       .057             .001       .001
  .20     1.213      1.208             .052       .048
  .30     3.243      3.339             .278       .281
  .40     3.429      3.382             .635       .641
  .50     1.667      1.616             .896       .895
  .60      .343       .365             .986       .984
  .70      .020       .032             .999       .999
  .80      .003       .001            1.000      1.000
  .90     0.000      0.000            1.000      1.000
 1.00     0.000      0.000            1.000      1.000
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ABSTRACT 

A single component system is assumed to progress through a finite number of increasingly bad levels of deterioration. The system with level $i$ $(0 \le i \le n)$ starts in state 0 when new, and is definitely replaced upon reaching the worthless state $n$. It is assumed that the transition times are directly monitored and the admissible class of strategies allows substitution of a new component only at such transition times. The durations in various deterioration levels are dependent random variables with exponential marginal distributions and a particularly convenient joint distribution. Strategies are chosen to maximize the average rewards per unit time. For some reward functions (with the reward rate depending on the state and the duration in this state) the knowledge of previous state duration provides useful information about the rate of deterioration.


Many authors have studied optimal replacement rules for parts characterized by Markovian deterioration, for example Kao [6] and Luss [9] and the many references found in those papers. Kao minimized the expected average cost per unit time for a semi-Markovian deteriorating system, and considered various combinations of state and age-dependent replacement rules.

Luss examined inspection and repair models, where he assumed that the operating costs occurring during the system's life increase with the increasing deterioration. The holding times in the various states were independently, identically, and exponentially distributed. The policies examined include the scheduling of the next inspections (when an inspection reveals that the state of the system is better than a certain critical state $k$) and preventive repairs (when an inspection reveals the state of the system being worse than or equal to $k$). The convenience of a Poisson-type structure for the number of events-per-unit-time made it relatively easy to allow general freedom in the selection of observation times.

The  work  studied  here  is  based  on  a  modification  of  the  model  used  by  Luss.  Our  model 
for  deterioration  is  more  general,  but  the  admissible  strategies  used  here  are  more  restricted. 
Here  we  allow  the  exponentially  distributed  durations  to  have  different  mean  values,  and  to  be 
positively  correlated. 

*This work was partially supported by Grant No. N00014-75-C-0858 from the Office of Naval Research.
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The  introduction  here  of  correlation  between  interval  durations  permits  the  modeling  of  a 
rate  of  deterioration  which  can  be  estimated  from  a  particular  realization  of  the  past  durations. 
However,  the  lack  of  a  Poisson-type  of  structure  for  the  events-per-unit-time  makes  it  much 
more  difficult  here  to  allow  general  freedom  in  the  selection  of  observation  times.  At  present, 
only  the  simple  case  of  direct  and  instantaneous  observation  of  deterioration  jumps  has  been 
considered. 

This model would be appropriate, for example, in a subsystem which functions, but with
reduced  efficiency,  when  some  redundant  components  have  failed;  and  for  which  failure  of  one 
component  might  indicate  environmental  stresses  which  increase  the  probability  of  failure  for 
other  components.  In  addition,  deterioration  in  correlated  stages  might  be  used  as  a  simple 
approximation  for  a  continuously  varying  degradation  which  does  not  exhibit  discrete  stages. 

Figure 1 shows a typical time history of deterioration and replacement. The duration in state $(i - 1)$, prior to reaching state $(i)$, is $r_{i-1}$. The intervals $d_i$ in Figure 1 represent the time required to replace a component when it has entered state $i$. The sequence $\{r_i\}$ will be Markov, characterized by a multivariate exponential distribution. Reward functions will be related to the deterioration state and the time spent in each state. The decision rule specifies whether or not to replace when entering each state $i$, on the basis of the history of $r_{i-1}, r_{i-2}, \ldots$. The Markov property simplifies the decision rule to be a collection of sets $\mathcal{C}_i$ such that we replace on entering state $i$ if and only if $r_{i-1} \in \mathcal{C}_i$.


[Figure 1. Typical time history of deterioration and replacement (vertical axis: state).]


The objective is to maximize the average reward per unit time:

(1)  $L = \lim_{T \to \infty} \frac{1}{T}\,(\text{Total reward in } (0, T))$

(2)  $L = \dfrac{E[\text{Reward per renewal}]}{E[\text{Duration between renewals}]} = \dfrac{\mathcal{R}}{\mathcal{T}}.$

(See Ross [11], page 160, for the equivalence of (1) and (2).) The mean reward per renewal is defined here as:

(3)  $\mathcal{R} = E\left[\sum_{i=0}^{N-1} \int_0^{r_i} c_i(t)\,dt - p_N\right],$
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in  which: 

$N$ = state at which replacement occurs (possibly random).

$p_N$ = replacement cost if replaced on entering state $N$ (possibly random).
$c_i(t)$ = reward rate when in state $i$.

Figure 2 shows several reward rate time functions $c(t)$ which have been considered. When one of these $c(t)$ functions is specified for a given problem, the $c_i(t)$ in (3) are assigned values $\beta_i c(t)$ with:

(4)  $\beta_0 > \beta_1 > \beta_2 > \cdots > \beta_n = 0,$

to assure greater reward rates in less deteriorated states. State $n$ corresponds to a completely failed or worthless component.


Figure 2. Reward rate time functions $c(t)$: (a) constant; (b) linear; (c) constant after set-up.


The mean duration in (2) is defined as:

(5)  $\mathcal{T} = E\left[\sum_{i=0}^{N-1} r_i + d_N\right],$

to include a possibly random time $d_N$ for carrying out a replacement at state $N$.


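While the ultimate objective is to choose the $\mathcal{C}_i$ to maximize the $L$ defined in (1), it is well known that a related problem of maximizing:

(6)  $\mathcal{L}_{\mathcal{C}}(\alpha) \triangleq \mathcal{R} - \alpha \mathcal{T}$

is simpler [1]. Indeed, the $\mathcal{C}_i$ which maximize $L$ will be identical to those which maximize $\mathcal{L}_{\mathcal{C}}(\alpha^*)$ for the $\alpha^*$ such that:

(7)  $\mathcal{L}^{o}(\alpha^*) = 0$, where $\mathcal{L}^{o}(\alpha) \triangleq \max_{\mathcal{C}} \mathcal{L}_{\mathcal{C}}(\alpha)$.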
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Section  1  considers  a  case  in  which  it  is  found  that  deterioration  rate  information  is  not 
useful  (e.g.,  the  optimal  policy  is  independent  of  the  amount  of  correlation  between  successive 
state  durations). 

Sections 2 and 3 consider other penalty cost structures, e.g., assuming that more deteriorated parts are rustier, hotter, or more brittle, and therefore more costly to replace. In such cases the optimal policies do make use of estimates of the deterioration rates as well as of observations of the deterioration level.

The Appendix describes useful properties of the multivariate exponential $\{r_i\}$ sequence which is used to model the correlated residence times in a sequence of deterioration states. These durations have marginal distributions which are exponential with mean values $\eta_i$, and correlations $\rho_{r_i r_j} = \rho^{|i-j|}$.

1.  CONSTANT  REWARD  RATE-STATE  INDEPENDENT 
REPLACEMENT  PENALTIES 


The constant reward rate case with $c_i(t) = \beta_i$ and with state-independent replacement penalties ($p_i = p$, $d_i = d$) is particularly simple to analyze. We will see that as long as $E[r_i | r_{i-1}, r_{i-2}, \ldots] > 0$ for all $i$, even if the $r_i$ are not exponentially distributed, the optimal rule will be to replace the deteriorating part upon entering some critical state $k^*$, independent of the observed durations $r_i$.


Based on the problem statement, the optimal decision on entering state $j$ must maximize the mean future reward until the next renewal, $\mathcal{L}_j(\alpha)$, for a suitable $\alpha$. Here:

(8)  $\mathcal{L}_j(\alpha) = E\left[\sum_{i=j}^{N-1} \beta_i r_i \,\Big|\, r_{j-1}\right] - \alpha E\left[\sum_{i=j}^{N-1} r_i \,\Big|\, r_{j-1}\right] - p - \alpha d.$

Immediately after a renewal, when $j = 0$, the expectations defining $\mathcal{L}_0(\alpha)$ are unconditional. The optimal decisions for each state will be found in terms of $\alpha$, and then the proper $\alpha^*$ (for producing decisions which maximize $L$) is the one for which the maximum vanishes:

(9)  $\max \mathcal{L}_0(\alpha^*) = \mathcal{L}_0^{o}(\alpha^*) = 0.$


Optimization by dynamic programming begins by considering the decisions at the last step, i.e., on entering state $(n - 1)$. There are two choices, to replace $(R)$ or not to replace $(\bar{R})$, with corresponding values:

(10)  $\mathcal{L}_{n-1}(\alpha; R) = -p - \alpha d,$

and:

(11)  $\mathcal{L}_{n-1}(\alpha; \bar{R}) = E[\beta_{n-1} r_{n-1} | r_{n-2}] - \alpha E[r_{n-1} | r_{n-2}] - p - \alpha d = E[(\beta_{n-1} - \alpha) r_{n-1} | r_{n-2}] - p - \alpha d.$

Clearly, the best decision is not to replace, if and only if, the difference

(12)  $\Delta_{n-1}(\alpha; r_{n-2}) \triangleq \mathcal{L}_{n-1}(\alpha; \bar{R}) - \mathcal{L}_{n-1}(\alpha; R) = (\beta_{n-1} - \alpha) E[r_{n-1} | r_{n-2}] \ge 0$

is non-negative. The sign of (12) will be the sign of $(\beta_{n-1} - \alpha)$, due to the non-negativity of all interval durations. Thus the best decision depends on $\alpha$ and the reward parameter $\beta_{n-1}$, but not on the previously observed duration. Two cases will be considered separately.

If $\beta_{n-1} \ge \alpha$ then the best decision at state $(n - 1)$ is not to replace. We will now explain why, under this condition, it is best not to replace at any state less than $n$. Consider the situation on entering $(n - 2)$. We have already shown that it is better not to replace on entering $(n - 1)$. Thus the choice will be based on a $\Delta_{n-2}$ of the form:

(13)  $\Delta_{n-2}(\alpha; r_{n-3}) = E[(\beta_{n-2} - \alpha) r_{n-2} + (\beta_{n-1} - \alpha) r_{n-1} | r_{n-3}].$

Here we have:

(14)  $(\beta_{n-2} - \alpha) > (\beta_{n-1} - \alpha) \ge 0,$

by assumption, and:

(15)  $E[r_{n-1} | r_{n-3}] \ge 0$ and $E[r_{n-2} | r_{n-3}] \ge 0,$

because all $r_i \ge 0$ with probability one. Thus $\Delta_{n-2}(\alpha; r_{n-3}) \ge 0$ for all $r_{n-3} \ge 0$, and it is also better not to replace here. This argument can be repeated for states $(n - 3), (n - 4), \ldots, 1, 0$.

The other case to consider is $\beta_{n-1} < \alpha$, which requires replacement on entering state $(n - 1)$, if the system ever reaches that state. When we consider the decision on entering $(n - 2)$, the $\Delta_{n-2}$ is:

(16)  $\Delta_{n-2}(\alpha; r_{n-3}) = E[(\beta_{n-2} - \alpha) r_{n-2} | r_{n-3}],$

which has the sign of $(\beta_{n-2} - \alpha)$. If $(\beta_{n-2} - \alpha) < 0$, then replacement is optimal on entering $(n - 2)$ and $(n - 3)$ is considered next. This iteration may eventually reach a state $(k - 1)$ where $(\beta_{k-1} - \alpha) \ge 0$ and it is better not to replace. Arguments similar to those for the $\beta_{n-1} - \alpha \ge 0$ case show that nonreplacement is the optimal decision at all states preceding the one which first arises as a nonreplacement state in this backward iteration.

In summary, in the constant reward rate, constant replacement penalty case $\mathcal{L}_0(\alpha)$ is maximized by a decision rule which says replace on entering some state $k \le n$ which depends on the reward parameters $\{\beta_i\}$ and the $\alpha$:

(17)  $k = \min\{i : (\alpha - \beta_i) > 0\}.$

Finally, we must choose $\alpha^*$ so that $\mathcal{L}_0^{o}(\alpha^*) = 0$, where:

(18)  $\mathcal{L}_0^{o}(\alpha) = -p - \alpha d + \sum_{i=0}^{k-1} (\beta_i - \alpha) E[r_i].$

Figure 3 shows a typical plot of $\mathcal{L}_0^{o}(\alpha)$ as a continuous, piecewise linear curve whose zero crossing ($\mathcal{L}_0^{o}(\alpha^*) = 0$) defines $\alpha^*$ and the optimal replacement state $k^*$ for maximizing $L$.

EXAMPLE. Figure 3 shows that the optimal average reward per unit time is $L = 2\tfrac{5}{7}$ when $k^* = 3$, where $\beta_0 = 5$, $\beta_1 = 4$, $\beta_2 = 3$, $\beta_3 = 2$, $\beta_4 = 1$, $\beta_5 = 0$, $p = 5$, $d = 1$, $\eta_i = 2$ $(i = 0, 1, 2, 3, 4)$ and $n = 5$. From Equation (18), the optimal $k$ is a function of $\alpha$, which remains constant when $\alpha$ varies over each interval $\beta_{i+1} < \alpha < \beta_i$, as shown in the figure.
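The example can be checked by direct enumeration. The sketch below does not locate the zero crossing of (18); instead it evaluates the ratio in (2) for the rule "replace on entering state $k$" for each $k$, which gives the same optimum for these parameters. The function name is illustrative, and exact rational arithmetic is used only to avoid rounding.

```python
from fractions import Fraction

def best_replacement_state(beta, eta, p, d):
    """Return (k, L) maximizing the average reward per unit time when the
    part is always replaced on entering state k (k = 1..n)."""
    best = None
    for k in range(1, len(eta) + 1):
        # Expected reward per renewal: sum of beta_i * E[r_i] for i < k, minus p.
        reward = sum(Fraction(b) * Fraction(e) for b, e in zip(beta[:k], eta[:k])) - p
        # Expected renewal duration: sum of E[r_i] for i < k, plus d.
        duration = sum(Fraction(e) for e in eta[:k]) + d
        rate = reward / duration
        if best is None or rate > best[1]:
            best = (k, rate)
    return best

beta = [5, 4, 3, 2, 1]   # beta_0 .. beta_4 (beta_5 = 0 for the failed state)
eta = [2, 2, 2, 2, 2]    # mean durations eta_0 .. eta_4
k_star, L_star = best_replacement_state(beta, eta, p=5, d=1)
```

With these inputs the maximizing rate also satisfies (17): the first state whose reward parameter falls below it is state 3.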

2.  INCREASING  REPLACEMENT  PENALTIES-CONSTANT  REWARD  RATE 

Here we generalize the model of the previous section by allowing the replacement cost $p_i$ and replacement duration $d_i$ to be functions of the replacement state $(i)$, and to be random. These parameters are assumed to have mean values $E[p_i]$ and $E[d_i]$ which are convex nondecreasing sequences in $i$, corresponding to the increased difficulty in replacing more deteriorated parts which may be, e.g., rustier, hotter or more brittle. We also assume that the mean durations are ordered: $\eta_0 \ge \eta_1 \ge \cdots \ge \eta_{n-1}$, corresponding to faster transitions of more deteriorated parts.

The foregoing assumptions, together with properties of the assumed multivariate exponential density for stage-durations (see Appendix), lead to an optimal decision policy with a nice structure. That optimal policy prescribes replacement when entering state $j$, if and only if $r_{j-1} < r^*_{j-1}$, where the decision thresholds are ordered: $0 \le r^*_0/\eta_0 \le r^*_1/\eta_1 \le \cdots \le r^*_{n-1}/\eta_{n-1} = \infty$.


The optimal decision on entering state $j$ must maximize the mean future reward until the next renewal, i.e., $\mathcal{L}_j(\alpha)$. For a suitable $\alpha$, we have:

(19)  $\mathcal{L}_j(\alpha) = E\left[\sum_{i=j}^{N-1} (\beta_i - \alpha) r_i \,\Big|\, r_{j-1}\right] - E[p_N + \alpha d_N].$

For notational simplicity we define $e_i = E[p_i + \alpha d_i]$ and note that $e_i$ is also convex and nondecreasing since we are only interested in $\alpha \ge 0$. The optimal decisions for each state will be found in terms of $\alpha$, and then the proper $\alpha^*$ (for producing decisions which maximize $L$) is the one for which the maximum $\mathcal{L}$ vanishes:

(20)  $\mathcal{L}_0^{o}(\alpha^*) = E\left[-e_N(\alpha^*) + \sum_{i=0}^{N-1} (\beta_i - \alpha^*) r_i\right] = 0.$


Optimization by dynamic programming begins by considering the decision at the last step. Since state $n$ represents a failed component, we definitely replace the component when it enters state $n$. Next, we consider the decision to be made on entering state $n - 1$. There are two choices: to replace $(R)$ or not to replace $(\bar{R})$, with corresponding values

(21)  $\mathcal{L}_{n-1}(\alpha; R) = -e_{n-1}$

(22)  $\mathcal{L}_{n-1}(\alpha; \bar{R}) = E[\beta_{n-1} r_{n-1} - \alpha r_{n-1} | r_{n-2}] - e_n$

for $\mathcal{L}_{n-1}(\alpha)$. Clearly, the best decision is not to replace, if and only if,

$\Delta_{n-1}(r_{n-2}) \triangleq \mathcal{L}_{n-1}(\alpha; \bar{R}) - \mathcal{L}_{n-1}(\alpha; R)$

is non-negative, i.e.,

(23)  $\Delta_{n-1}(r_{n-2}) = (\beta_{n-1} - \alpha) E[r_{n-1} | r_{n-2}] + (e_{n-1} - e_n) \ge 0.$

Referring to (A-6), $\Delta_{n-1}(r_{n-2})$ is a linear function of $r_{n-2}$, with

$\Delta_{n-1}(0) = (\beta_{n-1} - \alpha)\,\eta_{n-1}(1 - \rho) + (e_{n-1} - e_n).$

Figure 4 shows the possible shapes for this function. There can be no downward zero-crossing at an $r_{n-2} > 0$.


Thus, depending on the numerical values of the parameters, there are three possible kinds of optimal decision rules when entering state $(n - 1)$:

(i) replace for no $r_{n-2}$ if $\Delta_{n-1} \ge 0$ for all $r_{n-2} \ge 0$

(ii) replace for any $r_{n-2}$ if $\Delta_{n-1} < 0$ for all $r_{n-2} \ge 0$

(iii) replace if and only if $r^*_{n-2} > r_{n-2} \ge 0$, where $\Delta_{n-1}(r^*_{n-2}) = 0$.

In other words,

(24)  $\mathcal{C}_{n-1}(\alpha) = \{r_{n-2} : r_{n-2} < r^*_{n-2}\},$

where $r^*_{n-2}$ could be zero (case i) or infinite (case ii).


Figure 4. Possible shapes for $\Delta_{n-1}(r_{n-2})$.


Next we consider the optimal decision when entering state $(n - 2)$, assuming that the optimal decision will be made at the subsequent stage. We consider cases of $(\beta_{n-1} < \alpha)$ and $(\beta_{n-1} \ge \alpha)$ separately.

(a) $(\beta_{n-1} < \alpha)$ implies replacement on entering $(n - 1)$, so

$\Delta_{n-2}(r_{n-3}) = (\beta_{n-2} - \alpha) E[r_{n-2} | r_{n-3}] + (e_{n-2} - e_{n-1}),$

resulting in the same three possibilities listed above for state $(n - 1)$.

(b) for $(\beta_{n-1} \ge \alpha)$:

(25)  $\Delta_{n-2}(r_{n-3}) = e_{n-2} + (\beta_{n-2} - \alpha) E[r_{n-2} | r_{n-3}] + \int_{r^*_{n-2}}^{\infty} \left[(\beta_{n-1} - \alpha) E[r_{n-1} | r_{n-2}] - e_n\right] f(r_{n-2} | r_{n-3})\, dr_{n-2} + \int_0^{r^*_{n-2}} (-e_{n-1})\, f(r_{n-2} | r_{n-3})\, dr_{n-2}.$

Equation (25) can be simplified, with the aid of the notation $(x)^+ = \max(x, 0)$, to the form

(26)  $\Delta_{n-2}(r_{n-3}) = (e_{n-2} - e_{n-1}) + (\beta_{n-2} - \alpha) E[r_{n-2} | r_{n-3}] + E[(\Delta_{n-1}(r_{n-2}))^+ | r_{n-3}].$

Useful comparisons can be formed if normalized variables are introduced, namely $s_i = r_i/\eta_i$; $\delta_i(s_{i-1}) = \Delta_i(r_{i-1})$.

We now prove

(a) $\delta_{n-2}(s_{n-3}) \ge \delta_{n-1}(s_{n-3})$

(b) $\delta_{n-2}(s_{n-3})$ is convex with at most one upward zero crossing at an $s > 0$.

There is no harm in writing $\delta_{n-1}(s_{n-3})$ or $\delta_{n-1}(s^+)$ instead of $\delta_{n-1}(s_{n-2})$ for purposes of comparing functions.

To prove (a), consider

(27)  $\delta_{n-2}(s) - \delta_{n-1}(s) = [(e_{n-2} - e_{n-1}) - (e_{n-1} - e_n)] + E[(\delta_{n-1}(s^+))^+ | s] + [(\beta_{n-2} - \alpha)\eta_{n-2} - (\beta_{n-1} - \alpha)\eta_{n-1}]\, E[s^+ | s],$

where $s^+$ represents the normalized duration following $s$.

The terms on the right side of (27) are nonnegative due to the convexity of the $e_i$, $(\cdot)^+ \ge 0$, (A-6), and the assumed orderings of the $\beta_i$ and $\eta_i$.

This completes the proof that (a) is true. It follows immediately that if (i) (preceding Eq. (24)) applies for state $(n - 1)$, then it is also optimal not to replace in state $(n - 2)$ or any earlier state. (Recall $\beta_{n-1} < \beta_{n-2} < \cdots$, and we are now considering $\alpha \le \beta_{n-1}$.)

To prove (b), which is only of interest when an $r^*_{n-2} > 0$ exists, we refer to the theorem in the appendix. The test difference $\delta_{n-2}(s)$ can be written as

(28)  $\delta_{n-2}(s) = E[e_{n-2} - e_{n-1} + (\beta_{n-2} - \alpha)\eta_{n-2}\, s^+ + (\delta_{n-1}(s^+))^+ | s]$

in which the integrand has the properties required by $h(y)$ in the theorem. To see this, we note that $r^*_{n-2} > 0$ implies that $(\delta_{n-1}(0))^+ = 0$, so the integrand is nonpositive at $s^+ = 0$. Thus, $\delta_{n-2}(s)$ has the shape stated in (b), implying that

(29)  $\mathcal{C}_{n-2}(\alpha) = \{r_{n-3} : r_{n-3} < r^*_{n-3}\},$

where $r^*_{n-3}$ may be zero, infinity, or the nonnegative value defined by $\delta_{n-2}(r^*_{n-3}/\eta_{n-3}) = 0$.

The foregoing arguments can be repeated for $r_{n-4}, r_{n-5}, \ldots, r_0$ to prove that the optimal replacement policy has the form:

Replace on entering state $i$, if and only if, $r_{i-1} < r^*_{i-1}$, where $0 \le r^*_0/\eta_0 \le r^*_1/\eta_1 \le \cdots \le r^*_{n-1}/\eta_{n-1} = \infty$.

When repeating the proof for earlier stages, the $(\cdot)^+$ term in (27) and (28) is modified to the form, e.g., $[(\delta_{n-2}(s^+))^+ - (\delta_{n-1}(s^+))^+]$. This term is generally nonnegative, due to (a) at the preceding iteration (next time step); and it is zero for $s^+ = 0$ when proving (b), since then $r^*_{n-3} > 0$. Thus the basic theorem is still applicable.

3.  Computational  Procedure 

The preceding section derived the structure of the optimal decision rule for the case where replacement is more difficult and more expensive when the part is more deteriorated. The corresponding optimal decision thresholds can be formed as follows:

(a) Choose an initial $\alpha$.

(b) Find the $r^*_i(\alpha)$ $(i = n - 1, n - 2, \ldots, 0)$ recursively, via numerical integration of expressions like (26) (where $r^*_{n-3}(\alpha)$ is defined by the condition $\Delta_{n-2}(r^*_{n-3}) = 0$).

(c) Compute

$\mathcal{L}_0^{o}(\alpha) = -e_1 + \int_0^{\infty} [(\beta_0 - \alpha) r_0 + (\Delta_1(r_0))^+]\, f(r_0)\, dr_0.$

(d) If $|\mathcal{L}_0^{o}(\alpha)| < \epsilon$, for sufficiently small $\epsilon$, say $L_{\max} = \alpha^* = \alpha$; otherwise repeat the computational cycle starting with a new $\alpha$.




368  L  SHAW.  C'L  HSU  AND  S  G  TYAN 

The following properties of $\mathcal{L}_0^{o}(\alpha)$ can be used to generate an $\alpha$-sequence which converges to $\alpha^*$.

1. $\mathcal{L}_0^{o}(\alpha)$ is monotone decreasing, since $\mathcal{L}_0(\alpha)$ has this property for a fixed policy (see Eq. (19)); and if $\mathcal{L}_0^{o}(\alpha_2) > \mathcal{L}_0^{o}(\alpha_1)$ for $\alpha_2 > \alpha_1$, then the policy used to achieve $\mathcal{L}_0^{o}(\alpha_2)$ could be used to achieve an $\mathcal{L}_0(\alpha_1) > \mathcal{L}_0^{o}(\alpha_1)$, a contradiction.

2. When $\rho = 0$, all $r^*_i$ are zero or infinite: replacement always occurs on arrival at a critical state $i^*$. Use of that policy will achieve the same average reward for durations having any value of $\rho$. Thus, a useful bound on $\alpha^*(\rho)$ is $\alpha^*(0) \le \alpha^*(\rho)$; $0 \le \rho \le 1$.

3. When $\rho = 1$, future $r_i$ are completely predictable ($\mathrm{Var}[r_i | r_{i-1}] = 0$ in (A-7)), so $\alpha^*(1) \ge \alpha^*(\rho)$. In this case there is essentially a single random variable $r_0$, and the $r^*_i$ can be calculated without the need for numerical integration of Bessel functions.

4.  NUMERICAL  EXAMPLE 

Table I lists parameter values for a replacement problem which falls under the assumptions of Section 2.

TABLE 1. Numerical Example Parameters

  i           0      1      2      3      4      5

  beta_i      5      4      3      2      1      -
  eta_i       1      0.9    0.8    0.7    0.6    -
  E[p_i]      -      2      2.2    2.4    2.6    2.8
  E[d_i]      -      1      1.1    1.2    1.3    1.4

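CASE 1 ($\rho = 0$)

Since future durations are independent of past ones, the optimal policy replaces when a critical state $i^*$ is reached. The general optimal reward expression

$$\alpha^*(\rho) = \max \frac{E\left[\sum_{i=0}^{N-1} \beta_i r_i - p_N\right]}{E\left[\sum_{i=0}^{N-1} r_i + d_N\right]}$$

becomes, in this case

$$\alpha^*(0) = \max_j \frac{\sum_{i=0}^{j-1} \beta_i \eta_i - E[p_j]}{\sum_{i=0}^{j-1} \eta_i + E[d_j]} = \max_j A(j).$$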
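The maximization over $j$ is a short enumeration. The Python sketch below evaluates $A(j)$ for $j = 1, \ldots, 5$ with the Table 1 values. It assumes that the tabulated $E[p_i]$ and $E[d_i]$ apply to replacement states $i = 1, \ldots, 5$ and that $\beta_i$, $\eta_i$ apply to $i = 0, \ldots, 4$; this reading is an assumption, chosen because it reproduces the lower bound 2.205 quoted in Case 3.

```python
BETA = [5, 4, 3, 2, 1]                            # beta_0 .. beta_4
ETA = [1.0, 0.9, 0.8, 0.7, 0.6]                   # eta_0 .. eta_4
EP = {1: 2.0, 2: 2.2, 3: 2.4, 4: 2.6, 5: 2.8}     # E[p_j], assumed j = 1..5
ED = {1: 1.0, 2: 1.1, 3: 1.2, 4: 1.3, 5: 1.4}     # E[d_j], assumed j = 1..5

def A(j):
    """Average reward per unit time when always replacing on entering state j."""
    reward = sum(BETA[i] * ETA[i] for i in range(j)) - EP[j]
    duration = sum(ETA[i] for i in range(j)) + ED[j]
    return reward / duration

# Critical state and the corresponding optimal reward for rho = 0.
j_star = max(range(1, 6), key=A)
alpha0 = A(j_star)
```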


Direct evaluation shows that the maximum is attained at $j = 3$, with $\alpha^*(0) = A(3) = 2.205$.

CASE 2 ($\rho = 1$)

Since $r_i = r_0\, \eta_i/\eta_0$ in this case, the optimal rule specifies a replacement state $j(r_0)$ as a function of $r_0$.

For any such policy

$$\mathcal{L}_0(\alpha; j(r_0)) = E\left[-p_j - \alpha d_j + \frac{r_0}{\eta_0} \sum_{i=0}^{j-1} \eta_i(\beta_i - \alpha)\right].$$

This expectation will be maximized if $j(r_0)$ maximizes the bracketed term for each $r_0$. Making the necessary comparisons for a sequence of $\alpha$-values leads to the policy

$j^* = 1$, if $r_0 < 0.2698$

$j^* = 2$, if $0.2698 < r_0 < 0.7083$

$j^* = 3$, if $0.7083 < r_0$

for which $|\mathcal{L}_0| < 0.003$ and $\alpha^*(1) = 2.25$.
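The comparison for $\rho = 1$ can be reproduced numerically. In the sketch below, the bracketed term is a straight line in $r_0$ for each candidate state $j$, the optimal rule picks the largest line, and $\mathcal{L}_0(\alpha)$ is its expectation over an exponential $r_0$ with mean $\eta_0$. The trapezoidal integration, the bisection search and all function names are illustrative choices rather than the authors' procedure, and the indexing of $E[p_j]$, $E[d_j]$ by $j = 1, \ldots, 5$ is an assumed reading of Table 1.

```python
import math

BETA = [5, 4, 3, 2, 1]                            # beta_0 .. beta_4
ETA = [1.0, 0.9, 0.8, 0.7, 0.6]                   # eta_0 .. eta_4
EP = {1: 2.0, 2: 2.2, 3: 2.4, 4: 2.6, 5: 2.8}     # E[p_j], assumed j = 1..5
ED = {1: 1.0, 2: 1.1, 3: 1.2, 4: 1.3, 5: 1.4}     # E[d_j], assumed j = 1..5

def lines(alpha):
    """(intercept, slope in r0) of the bracketed term for j = 1..5."""
    out = []
    for j in range(1, 6):
        slope = sum(ETA[i] * (BETA[i] - alpha) for i in range(j)) / ETA[0]
        out.append((-EP[j] - alpha * ED[j], slope))
    return out

def L0(alpha, upper=40.0, steps=8000):
    """E[max_j bracket] for r0 ~ exponential(mean eta_0), trapezoidal rule."""
    coeffs = lines(alpha)
    h = upper / steps
    total = 0.0
    for k in range(steps + 1):
        r0 = k * h
        best = max(c + m * r0 for c, m in coeffs)
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * best * math.exp(-r0 / ETA[0]) / ETA[0]
    return total * h

def crossover(j, alpha):
    """Value of r0 above which replacing at state j + 1 beats state j."""
    extra_cost = (EP[j + 1] - EP[j]) + alpha * (ED[j + 1] - ED[j])
    extra_gain = ETA[j] * (BETA[j] - alpha) / ETA[0]
    return extra_cost / extra_gain

def solve_alpha(lo=2.0, hi=3.0, iterations=30):
    """Bisection on the monotone decreasing function L0(alpha)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if L0(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Calling `crossover(1, 2.25)` and `crossover(2, 2.25)` gives the two thresholds of the policy, and `solve_alpha()` returns the root of $\mathcal{L}_0(\alpha)$ for comparison with the quoted $\alpha^*(1)$.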

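CASE 3 ($\rho = \tfrac{1}{2}$)

We know that $2.205 < \alpha^*(\tfrac{1}{2}) < 2.25$. A pilot calculation along the lines indicated in the previous section shows that $r^*_0(\tfrac{1}{2}) = 0$, $r^*_j(\tfrac{1}{2}) = \infty$ for $j \ge 2$, and

$$r^*_1 = \frac{9(\alpha^* - 2)}{8(3 - \alpha^*)},$$

where $\alpha^*$ is chosen to make the following $\mathcal{L}_0(\alpha)$ vanish:

$$\mathcal{L}_0(\alpha) = 6.4 - 3\alpha + \int_0^{\infty} \int_{r^*_1}^{\infty} \left[1 - \frac{\alpha}{2} + \frac{4}{9}(3 - \alpha)\, r_1\right] \frac{20}{9} \exp\left(-2 r_0 - \frac{20}{9} r_1\right) I_0\left(2.981\sqrt{r_0 r_1}\right) dr_1\, dr_0.$$

The known bounds on the optimal reward $\alpha^*(\tfrac{1}{2})$ imply that the optimal threshold $r^*_1$ is bounded, too: $0.290 < r^*_1(\tfrac{1}{2}) < 0.375$.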
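The root of $\mathcal{L}_0(\alpha)$ for this case can be sketched without the Bessel function. The following Python example relies on a simplification that is ours and is not stated in the paper. Because the policy never replaces on entering state 1, only the marginal distribution of $r_1$ matters, and that marginal is exponential with mean $\eta_1 = 0.9$. The continuation gain is linear in $r_1$ with slope $\tfrac{4}{9}(3 - \alpha)$ and vanishes at $r^*_1$, so the expectation of its positive part is $0.4\,(3 - \alpha)\exp(-r^*_1/0.9)$. Function names are illustrative.

```python
import math

ETA1 = 0.9   # mean of r_1 (Table 1)

def threshold(alpha):
    """r_1^* as a function of alpha, from the expression in the text."""
    return 9.0 * (alpha - 2.0) / (8.0 * (3.0 - alpha))

def L0(alpha):
    """6.4 - 3*alpha plus the expected positive part of the continuation gain,
    integrated in closed form against the exponential marginal of r_1."""
    slope = (4.0 / 9.0) * (3.0 - alpha)
    return 6.4 - 3.0 * alpha + slope * ETA1 * math.exp(-threshold(alpha) / ETA1)

def solve_alpha(lo=2.2, hi=2.25, iterations=60):
    """Bisection; L0 is positive at lo and negative at hi."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if L0(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

alpha_half = solve_alpha()
r1_star = threshold(alpha_half)
```

The resulting `alpha_half` and `r1_star` can be compared with the bounds stated above.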

Similar studies of other values of the correlation parameter $\rho$ lead to the optimal policy pattern described in Table II. One might say that as $\rho$ increases, the past observations are more informative, the optimal policy makes finer distinctions, and the optimal reward increases.

5.  CONCLUSIONS 


A  multivariate  exponential  distribution  has  been  used  to  describe  successive  stages  of 
deterioration.  Optimal  replacement  strategies  have  been  found  for  the  class  of  decision  rules 
which  can  continuously  observe  the  deterioration  state,  and  which  may  make  replacements  only 
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TABLE  2  —  Optimal  Policy  Structure 


at  the  times  of  state  transitions.  Similar  results  have  been  found  for  the  other  reward  rates 
shown  in  Figure  2  (linear;  and  constant  after  an  initial  set-up  interval  for  readjustment  to  the 
new  state)  [5]. 

The optimal replacement policy derived in Section 2 makes use of observations which allow estimation of the current rate of deterioration for the correlated stages of deterioration. The numerical example demonstrates how the optimal policy and reward are related to the amount of correlation between the durations in successive deterioration states. For the model used here, the optimal policy for $\rho = 0$ will achieve the same reward (less than optimal) for any $\rho$. Depending on the application, the suboptimal approach may be satisfactory. The additional reward achievable by the actual optimal policy is bounded by the easily computed optimal reward for $\rho = 1$. However, it is possible that the small percentage improvement achievable for the $\rho = 1/2$ case in the example could represent a significant gain in a particular application.

The orderings of state dependent rewards, mean durations, etc. assumed here are physically reasonable, and lead to a nice ordering of the decision regions. However, other $\beta_i$, $\eta_i$, $p_i$, $d_i$ orderings might be more appropriate in other situations. The model introduced here for dependent stage durations could be used in those cases, together with dynamic programming optimization, although the solutions may not have comparably neat structures.

We  anticipate  that  the  optimization  approach  and  policy  structure  described  here  will  also 
be  applicable  to  replacement  problems  having  similar  deterioration  models.  One  easy  extension 
would  be  to  change  the  correlation  structure  in  (A-3)  from  p1'-'1  to  something  else,  e.g., 
P '  ' 1  -bp)'"'1.  Other  changes  could  permit  the  r,  to  have  non-exponential  distributions,  as 
long  as  similar  total-positivity  properties  exist  to  permit  analogous  simplifications  in  the 
dynamic  programming  arguments. 

Some of these other $r_i$ distributions are being studied now in the hope of finding similar models which exhibit large percentage differences between the optimal rewards as $\rho_{r_i r_{i+1}}$ changes from zero to one. (Other choices of the numerical values in Table I have not revealed any such cases for the current model.)

One reasonable generalization would allow transitions from state $i$ to any state $j > i$. This would not change the form of the solution in the case of constant replacement penalties. However, the possibility of these additional transitions does ruin the structure when replacement penalties increase with the deterioration state. (The $\delta_{n-2}(s) \ge \delta_{n-1}(s)$ argument is no longer valid.)
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APPENDIX 

Dependence  Relationships  Among  Multivariate 
Exponential  Variables 

Many multivariate distributions have been described and applied to reliability problems [4,8,10]. In each case the marginal univariate distributions are of the negative exponential form. Properties of the distribution used here are most easily derived by exploiting its relationship to multivariate normal distributions [3,5].

The multivariate exponential variables $r_1, r_2, \ldots, r_n$ can be viewed as sums of squares:

(A-1)  $r_i = w_i^2 + z_i^2,$

where $w$ and $z$ are independent, zero mean, identically distributed normal vectors, each with covariance matrix $\Gamma = [\gamma_{ij}]$. It follows that the $r_i$ have exponential marginal distributions with

(A-2)  $E[r_i] = 2\gamma_{ii}, \qquad \rho_{r_i r_j} = [\rho_{w_i w_j}]^2.$

We specialize to the case where the underlying normal sequences $\{w_i\}$ and $\{z_i\}$ are Markovian

(A-3)  $\gamma_{ij} = \sqrt{\gamma_{ii}\gamma_{jj}}\; \rho^{|i-j|/2}, \qquad \text{i.e., } \rho_{r_i r_j} = \rho^{|i-j|},$

and find that $\{r_i\}$ is also Markov with the joint density

(A-4)  $f(r_0, r_1, r_2, \ldots, r_{n-1}) = (1 - \rho)^{-(n-1)} \prod_{i=0}^{n-1} \eta_i^{-1} \prod_{i=0}^{n-2} I_0\left(\frac{2\sqrt{\rho}}{1 - \rho}\sqrt{\frac{r_i r_{i+1}}{\eta_i \eta_{i+1}}}\right) \exp\left\{-\frac{1}{1 - \rho}\left[\frac{r_0}{\eta_0} + \frac{r_{n-1}}{\eta_{n-1}} + \sum_{i=1}^{n-2} \frac{r_i(1 + \rho)}{\eta_i}\right]\right\}; \quad n \ge 2.$

Equation (A-4) uses the modified Bessel function $I_0(\cdot)$ and the notations $E[r_i] = \eta_i$ and $\rho_{r_i r_{i+1}} = \rho$. (When $n = 2$, the summation in $\exp(\cdot)$ vanishes.)


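The conditional density is easily shown to satisfy the Markov property and [5]

(A-5)  $f(r_i | r_{i-1}) = [\eta_i(1 - \rho)]^{-1} \exp\left\{-\frac{1}{1 - \rho}\left(\frac{r_i}{\eta_i} + \frac{\rho\, r_{i-1}}{\eta_{i-1}}\right)\right\} I_0\left(\frac{2}{1 - \rho}\sqrt{\frac{\rho\, r_i r_{i-1}}{\eta_i \eta_{i-1}}}\right)$

with

(A-6)  $E[r_i | r_{i-1}] = \eta_i + \rho\,\frac{\eta_i}{\eta_{i-1}}\,(r_{i-1} - \eta_{i-1})$

(A-7)  $\mathrm{Var}[r_i | r_{i-1}] = \eta_i^2\left[(1 - \rho)^2 + 2\rho(1 - \rho)\, r_{i-1}/\eta_{i-1}\right].$

These conditional moments show, e.g., that the conditional mean of $r_i$ exceeds its mean in proportion to the amount by which $r_{i-1}$ exceeds its mean, and that the conditional mean is a linearly increasing function of $r_{i-1}$.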
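The conditional moments can be checked against the conditional density by numerical integration. The Python sketch below assumes the standard Bessel-function form of the conditional density of this bivariate exponential model, with scale $\eta_i(1 - \rho)$ and Bessel argument $\frac{2}{1-\rho}\sqrt{\rho\, r_i r_{i-1}/(\eta_i \eta_{i-1})}$. It uses a power series for $I_0$ and the trapezoidal rule; the function names and the test values ($\eta_i = 0.9$, $\eta_{i-1} = 1$, $\rho = 1/2$, $r_{i-1} = 1$) are illustrative.

```python
import math

def bessel_i0(x):
    """Modified Bessel function I_0(x) from its power series."""
    q = 0.25 * x * x
    term, total, k = 1.0, 1.0, 0
    while term > 1e-17 * total:
        k += 1
        term *= q / (k * k)
        total += term
    return total

def cond_density(r, r_prev, eta, eta_prev, rho):
    """Assumed form of f(r_i | r_{i-1}) for the Markov exponential sequence."""
    expo = -(r / eta + rho * r_prev / eta_prev) / (1.0 - rho)
    arg = 2.0 / (1.0 - rho) * math.sqrt(rho * r * r_prev / (eta * eta_prev))
    return math.exp(expo) * bessel_i0(arg) / (eta * (1.0 - rho))

def cond_moments(r_prev, eta, eta_prev, rho, upper=30.0, steps=30000):
    """Return (total mass, mean, variance) of the conditional density."""
    h = upper / steps
    m0 = m1 = m2 = 0.0
    for k in range(steps + 1):
        r = k * h
        w = 0.5 if k in (0, steps) else 1.0
        f = w * cond_density(r, r_prev, eta, eta_prev, rho)
        m0 += f
        m1 += r * f
        m2 += r * r * f
    m0, m1, m2 = m0 * h, m1 * h, m2 * h
    mean = m1 / m0
    return m0, mean, m2 / m0 - mean * mean

mass, mean, var = cond_moments(r_prev=1.0, eta=0.9, eta_prev=1.0, rho=0.5)
# Closed forms with the same inputs: the conditional mean is linear in r_prev,
# and the variance scales with eta_i squared.
mean_formula = 0.9 + 0.5 * (0.9 / 1.0) * (1.0 - 1.0)
var_formula = 0.9 ** 2 * ((1 - 0.5) ** 2 + 2 * 0.5 * (1 - 0.5) * 1.0 / 1.0)
```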


The dynamic programming arguments used here required calculations of conditional expectations based on (A-5). As is often the case [2], the total positivity properties of $f(r_i | r_{i-1})$ are very useful for determining structural properties of the optimal policy.

It is straightforward to show that both $f(r_i, r_{i-1})$ and $f(r_i | r_{i-1})$ are totally positive of all orders ($TP_\infty$) [5,7]. This means, for $f(r_i, r_{i-1})$, that the following determinants are nonnegative for any $N$ and any $\alpha_1 < \alpha_2 < \cdots < \alpha_N$; $\beta_1 < \beta_2 < \cdots < \beta_N$:

$$\begin{vmatrix} f(\alpha_1, \beta_1) & f(\alpha_1, \beta_2) & \cdots & f(\alpha_1, \beta_N) \\ \vdots & & & \vdots \\ f(\alpha_N, \beta_1) & \cdots & & f(\alpha_N, \beta_N) \end{vmatrix} \ge 0.$$

THEOREM: If $h(y)$ is continuous and convex, and satisfies the bounds

(i) $h(0) \le 0$

(ii) $|h(y)| < a + b\, y^{2m}$; $a > 0$, $b > 0$, $y \ge 0$, $m$ = positive integer,

and if $g(x) = \int_0^{\infty} h(y)\, f(y|x)\, dy$, where $f(y|x)$ is $TP_\infty$, then $g(x)$ is continuous, convex, bounded in the sense

$|g(x)| < a' + b'\, x^{2m}$; $a' > 0$, $b' > 0$, $x \ge 0$;

and belongs to one of the three following categories:

(I) $g(x) \ge 0$ for all $x \ge 0$,

(II) $g(x) < 0$ for all $x \ge 0$ except with a possible zero at $x = 0$,

(III) there exists a unique $x^*$, $0 < x^* < \infty$, such that $g(x) \ge 0$ for all $x \ge x^*$; and $g(x) < 0$ for $x < x^*$ except for a possible zero at $x = 0$.

This  theorem  is  used  to  define  optimal  decision  regions  according  to  the  sign  of  a  function  like 
g(x),  with  x*  corresponding  to  a  decision  threshold. 
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ABSTRACT 

In  this  paper,  a  statistical  analytic  model  for  evaluation  of  the  performance 
of  a  standard  electric  bomb  fuze  timer  is  presented.  The  model  is  based  on 
what  is  called  a  selective  design  assembly,  where  one  item,  namely,  a  resistor, 
is  used  to  time  the  circuit.  In  such  an  assembly,  the  remaining  components  are 
chosen  a  priori  from  predetermined  distributions.  Based  on  the  analysis,  a  gen¬ 
eral  numerical  integration  scheme  is  utilized  for  assessing  performance  of  the 
timer. The results of a computer simulation are also given. In the last section
of the paper, a theory for evaluation of the yield of two or more timers
designed  to  operate  in  sequence  is  derived.  To  appraise  such  a  scheme,  a  nu¬ 
merical  quadrature  routine  is  developed. 


1.  INTRODUCTION  AND  PHYSICAL  DESCRIPTION 

In  this  paper,  we  shall  be  concerned  with  the  statistical  analysis  of  the  bomb  fuze  timer 
shown  in  Figure  1.  As  is  common  in  practice,  a  standard,  or  precision,  resistor  is  used  to  time 
the  circuit  after  the  rest  of  the  components  have  been  assembled  in  a  random  fashion.  Then, 
to  meet  certain  timing  requirements  to  be  discussed  later,  a  resistor  is  selected  and  introduced 
into  the  circuit.  A  number  of  tests  must  afterwards  be  performed  in  sequence  to  check  the  per¬ 
formance  of  the  product  under  differing  environmental  conditions.  Such  environmental 
influences  are,  for  example,  temperature  effects,  effect  of  packaging,  resistor  incrementation  (to 
be  discussed),  and  effect  of  vibration  and  moisture  uptake.  In  addition,  one  might  have  several 
timers which operate sequentially, all fed from the same energy storage capacitor C1 of Figure 1.
This paper is devoted to an analysis of such a timer in what is called the ambient temperature
range, whose limits are 70°F and 80°F, respectively. We will also indicate the procedure
for  treating  analytically  the  assessment  of  performance  of  combinations  of  several  timers.  The 
author  has  been  involved  in  a  Monte  Carlo  study  for  the  Navy  of  such  timers.  Previous  work 
has  involved  reliability  studies  of  an  entire  fuze  assembly  using  these  timers  [2]. 

2.  RESISTOR  SELECTION  PROCESS 

Figure 1. Fuze timer configuration

The timer indicated in Figure 1 works once the potential difference across the two capacitors C2 and C3 is sufficient to fire the cold cathode diode tube VT. Capacitors C1 and C3 initially have the same potential across them. As time progresses, C1 discharges through resistor RES into C2, while C3 serves as a reference capacitor. Thus, the voltage across C2 builds up until the potential across capacitors C2 and C3 is adequate to fire tube VT. The relationship between firing time and the values of the circuit components can be derived from a simple first order differential equation and is given by

(2.1)  $t = \dfrac{R\,C_1 C_2}{C_1 + C_2}\,\ln\left[\dfrac{V C_1}{V C_1 - (V_T - V)(C_1 + C_2)}\right],$

where 

$C_1$ = capacitance of capacitor C1
$C_2$ = capacitance of capacitor C2

$V$ = supply voltage (potential across C3 and potential initially across C1)

$V_T$ = firing voltage of cold cathode diode tube VT
$R$ = resistance of resistor RES.

To illustrate the pertinent features of the process, write (2.1), for brevity, in the form
(2.2)  $t = R\,F(C_1, C_2, V, V_T)$.

Note  that  (2.2)  is  linear  and  homogeneous  in  R ,  so  that  R  can  be  used  as  a  scaling  parameter. 
This  is  precisely  how  it  is  used  when  the  timer  is  first  assembled. 
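As a minimal sketch of how (2.1) and (2.2) can be evaluated, the Python fragment below computes the factor $F$ and the resistance needed for a given firing time. The function names are illustrative assumptions, not taken from the paper; the numerical inputs are the expected values quoted in the example of Section 4 (capacitances in microfarads, so that resistance comes out in megohms).

```python
import math


def timing_factor(c1, c2, v, vt):
    """F(C1, C2, V, VT) of (2.2): firing time per unit resistance."""
    y = v * c1 - (vt - v) * (c1 + c2)
    if y <= 0:
        # The voltage on C2 tends to V*C1/(C1 + C2); if VT - V is not below
        # that limit, the tube never fires.
        raise ValueError("firing voltage is never reached")
    return c1 * c2 / (c1 + c2) * math.log(v * c1 / y)


def firing_time(r, c1, c2, v, vt):
    """Firing time t = R * F, linear and homogeneous in R."""
    return r * timing_factor(c1, c2, v, vt)


def resistance_for_time(t, c1, c2, v, vt):
    """Resistance that gives firing time t for the given components."""
    return t / timing_factor(c1, c2, v, vt)


# Expected values from the Section 4 example: microfarads, volts, seconds.
r0 = resistance_for_time(2.6, 0.44, 0.15, 177.0, 235.0)
print(f"reference resistance: {r0:.2f} megohms")
```

Because $t$ is proportional to $R$, doubling the resistance doubles the firing time, which is the scaling property used when the timer is first assembled.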

In  practice,  the  resistors  are  supplied  in  large  numbers  by  the  manufacturer,  after  which 
they  are  tested  and  sorted  by  the  user  into  a  large  number  of  bins.  The  resistors  in  each  bin 
have  resistances,  at  a  standard  temperature,  which  fall  into  a  certain  interval.  These  intervals 
are  arranged  to  have  the  same  "percent  width",  to  be  described  in  more  detail  below.  The  timer 
is to be designed to fire at a nominal time $t_N$. Since capacitors C1 and C2 are chosen at random
from a lot, their capacitances $C_1$ and $C_2$ may be treated as random variables. Likewise,
tube  firing  voltage  VT  may  also  be  considered  as  a  random  variable.  In  general,  we  shall  also 
consider  the  supply  voltage  V  to  be  random. 

Let us agree to denote by $R_0$ that value of $R$ obtained from relation (2.1) when $t = t_N$ and $C_1$, $C_2$, $V$, and $V_T$ are given their expected values at some standard temperature, e.g., 75°F. For convenience, $R_0$ may be used as a reference resistance, and the bin to which reference resistor RES$_0$, of resistance $R_0$, belongs could be called the reference resistor bin. The interval corresponding to this bin is to contain all resistances which fall between $R_0(1 - \epsilon)$ and $R_0(1 + \epsilon)$, where $\epsilon$ is a preassigned small positive number. Our second bin will contain all resistors whose resistances fall between $R_0(1 + \epsilon)$ and $R_0(1 + \epsilon)^2/(1 - \epsilon)$, and the third bin those resistors whose resistances lie between $R_0(1 - \epsilon)^2/(1 + \epsilon)$ and $R_0(1 - \epsilon)$. In general, our intervals are to be so constructed that the ratio of right endpoint to left endpoint is always $(1 + \epsilon)/(1 - \epsilon)$, which, to first order accuracy, is just $1 + 2\epsilon$. Alternatively, one may divide the difference of the two endpoints by its midpoint to obtain precisely $2\epsilon$. We shall, therefore, say that each such interval has "percent width" $2\epsilon$. In setting up the interval division scheme, a percent increment $\epsilon_1$ is chosen a priori, and then $\epsilon = \epsilon_1/100$. This $\epsilon_1$ is typically of the order of 1/2 to 1%. Figure 2 is a diagram of this scheme.

Figure 2. Resistance interval setup
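The bin construction just described can be sketched in a few lines of Python. The signed index used here (0 for the reference bin, positive indices above it, negative below) is an assumption made for convenience and is not the paper's numbering of "second" and "third" bins; the function names are likewise illustrative.

```python
import math


def bin_edges(r0, eps, i):
    """Return (left, right) endpoints of bin i; bin 0 is the reference bin."""
    q = (1 + eps) / (1 - eps)        # ratio of right endpoint to left endpoint
    left = r0 * (1 - eps) * q ** i
    return left, left * q


def bin_index(r, r0, eps):
    """Signed index of the bin whose interval contains resistance r."""
    q = (1 + eps) / (1 - eps)
    return math.floor(math.log(r / (r0 * (1 - eps))) / math.log(q))


def percent_width(left, right):
    """Difference of the endpoints divided by their midpoint."""
    return (right - left) / (0.5 * (left + right))
```

For any bin, `percent_width(*bin_edges(r0, eps, i))` equals $2\epsilon$ up to rounding, which is the property that makes the later probability statements independent of the bin.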

Once again referring to our circuit configuration, where $C_1$, $C_2$, $V$, and $V_T$ are random
variables, let us define

(2.3)  $t_0 = R_0\,F(C_1, C_2, V, V_T)$.

Then, to achieve the nominal time $t_N$, we define our nominal resistance to be

(2.4)  $R_N = t_N R_0 / t_0$.

Note that, since $t_0$ is a random variable (being a function of the random variables $C_1$, $C_2$, $V$, and $V_T$), $R_N$ is also a random variable. A technician may use relation (2.4) to determine $R_N$. Then he picks a resistor RES$_p$ at random from the bin to which resistor RES$_N$ belongs and integrates such resistor, of resistance $R_p$, into the circuit. This process is called, in fuze technology parlance, "resistor incrementation." Note that $R_p$ is a random variable which is statistically dependent on $R_N$ inasmuch as $R_p$ and $R_N$ must lie in the same interval. However, once attention is restricted to a given interval of the scheme, it is clear that the value of $R_N$ in no way influences the value of $R_p$, since one is free to select any resistor in the bin to which the nominal resistor belongs. We shall reemphasize this fact in Section 3. For simplicity we index the intervals by $i$, letting their left and right endpoints be $r_i$ and $r_{i+1}$, respectively. To achieve
compatibility,  the  bins  should  initially  be  formed  and  kept  at  some  standard  temperature,  and 
the  timer  should  be  assembled  at  that  same  temperature.  In  practice,  this  will,  in  all  likelihood, 
not  be  the  case,  but  one  may  compensate  for  this  defect  by  studying  the  sensitivity  of  the  timer 
to  changes  in  bin  interval  width.  For  example,  if  by  doubling  the  interval  width,  the  overall 
change  in  performance  is  insignificant,  it  may  be  safely  assumed  that  such  a  discrepancy  was 
unimportant  (provided  the  distributions  due  to  ambient  temperature  variations  are  of  small 
variance). 

3.  PROBABILITY  INTERVALS  AT  THE  STANDARD  TEMPERATURE 

The problem of determining the probability of operation of the timer within two given
times, say $t_1$ and $t_2$, when there is no effect other than resistor selection is not difficult. (We
also ignore, in this section, the effect of tube firing voltage variation from one firing to the




E.A  COHEN.  JR 


next.  This  phenomenon  will  be  discussed  in  some  detail  in  Section  4.)  The  reason  is  that  the 
time  is  linear  and  homogeneous  in  resistance  R.  In  fact,  the  bins  have  been  designed  to  take 
advantage  of  this  feature,  and  we  shall  show  that  the  probability  interval  is  independent  of  the 
bin in which resistor RES$_N$ falls.

First of all, let $t_{\min}^{(i)}$ and $t_{\max}^{(i)}$ be the minimum and maximum times, respectively, obtainable when the nominal resistance $R_N$ and the picked resistance $R_p$ come from a given bin $i$. Also, let $F_{\min}^{(i)}$ and $F_{\max}^{(i)}$ be the smallest and largest values of $F$, respectively, given only $t_N$ and knowing that $R_N$ comes from that bin. It follows that

(3.1)  $t_{\min}^{(i)} = r_i F_{\min}^{(i)} = r_i t_N / r_{i+1}$

and

(3.2)  $t_{\max}^{(i)} = r_{i+1} F_{\max}^{(i)} = r_{i+1} t_N / r_i$.

Therefore, given that $R_N$ and $R_p$ lie in interval $i$,

(3.3)  $r_i t_N / r_{i+1} \le t \le r_{i+1} t_N / r_i$.

Since $r_i / r_{i+1} = (1 - \epsilon)/(1 + \epsilon)$,

(3.4)  $(1 - \epsilon)/(1 + \epsilon) \le t/t_N \le (1 + \epsilon)/(1 - \epsilon)$,

independent of bin interval. In other words, (3.4) is true with probability 1.
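The bound (3.4) can be checked by simulating the selection process of Section 2. The sketch below is only illustrative: it assumes, purely to have something to sample, that the nominal resistance is normally distributed, and it draws the picked resistance uniformly from the bin containing the nominal value. Since $t = R_p F$ and $t_N = R_N F$ share the same $F$, the ratio $t/t_N$ is simply $R_p/R_N$.

```python
import math
import random


def simulate_time_ratios(r0, eps, sigma, trials, seed=0):
    """Sample t/t_N = R_p/R_N under resistor incrementation."""
    rng = random.Random(seed)
    q = (1 + eps) / (1 - eps)
    ratios = []
    for _ in range(trials):
        r_n = rng.gauss(r0, sigma)          # assumed distribution of R_N
        if r_n <= 0:
            continue                        # discard non-physical draws
        i = math.floor(math.log(r_n / (r0 * (1 - eps))) / math.log(q))
        left = r0 * (1 - eps) * q ** i      # bin containing R_N
        r_p = rng.uniform(left, left * q)   # picked resistor from that bin
        ratios.append(r_p / r_n)
    return ratios


ratios = simulate_time_ratios(40.16, 0.01, 4.04, 20_000)
print(min(ratios), max(ratios))
```

Whatever distribution is assumed for $R_N$, every sampled ratio should stay inside the interval of (3.4), because the bound depends only on the bin geometry.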

Generally, suppose that one is interested in the probability that firing time falls between two prescribed limits about the nominal time. Consider once more a given bin $i$. Let us denote by $R_N^{(i)}$ and $R_p^{(i)}$ random variables derived from $R_N$ and $R_p$ respectively under the condition that $R_N$ and, therefore, $R_p$ must lie in interval $i$. From our discussion in Section 2, it is clear that these new random variables must be independent. Let $t_1$ and $t_2$ be the lower and upper limits, respectively, on firing time. For any given value of the random variable $R_N^{(i)}$, one can determine limits on the random variable $R_p^{(i)}$ so that the firing time lies between $t_1$ and $t_2$. Since, by definition, $t_N = R_N^{(i)} F$, it follows that $R_p^{(i)}$ cannot be less than

(3.5)  $t_1/F = t_1 R_N^{(i)}/t_N$.

Similarly, $R_p^{(i)}$ cannot exceed

(3.6)  $t_2 R_N^{(i)}/t_N$.

One must, of course, realize that (3.5) may be smaller than $r_i$ and (3.6) larger than $r_{i+1}$ for values of $R_N^{(i)}$ close to $r_i$ and $r_{i+1}$, respectively.

If we let $g(R_N)$ be the density function of the random variable $R_N$ defined by (2.3) and (2.4), whose range is a function of the domain of $C_1$, $C_2$, $V$, and $V_T$, then the induced random variable $R_N^{(i)}$ has conditional density

(3.7)  $g^{(i)}(R_N^{(i)}) = g(R_N)/P(r_i < R_N < r_{i+1}) = g(R_N) \Big/ \int_{r_i}^{r_{i+1}} g(R_N)\,dR_N$.

The range of $R_N^{(i)}$ is restricted to the interval $[r_i, r_{i+1})$. Using the mean value theorem of integral calculus, (3.7) becomes

(3.8)  $g^{(i)}(R_N^{(i)}) = g(R_N)/[g(\xi)(r_{i+1} - r_i)]$,  $r_i < \xi < r_{i+1}$.

If $r_{i+1} - r_i$ is sufficiently small, one sees that

(3.9)  $g^{(i)}(R_N^{(i)}) \approx 1/(r_{i+1} - r_i)$.

Similarly, let $f^{(i)}(R_p^{(i)})$ be the density function for picked resistance $R_p^{(i)}$, whose range is likewise restricted to $[r_i, r_{i+1}]$. Then, with the knowledge that $R_N^{(i)}$ and $R_p^{(i)}$ are independent random variables, and, letting $P_i(t_1 \le t \le t_2)$ be the probability that firing time falls between $t_1$ and $t_2$ (given that $R_N$ and $R_p$ come from interval $i$),

(3.10)  $P_i(t_1 \le t \le t_2) = \displaystyle\int_{r_i}^{r_{i+1}} \int_{t_1 R_N^{(i)}/t_N}^{t_2 R_N^{(i)}/t_N} g^{(i)}(R_N^{(i)})\, f^{(i)}(R_p^{(i)})\, dR_p^{(i)}\, dR_N^{(i)}$.

We take the liberty of defining $f^{(i)}(R_p^{(i)}) = 0$ in (3.10) whenever $R_p^{(i)} \notin (r_i, r_{i+1}]$. This is done purely for the sake of convenience of notation even though the range of $R_p^{(i)}$ is $(r_i, r_{i+1}]$.


The probability that the time falls between $t_1$ and $t_2$ is expressed by

(3.11)  $P(t_1 \le t \le t_2) = \displaystyle\sum_{i=-\infty}^{\infty} p_i\, P_i(t_1 \le t \le t_2)$,

where $p_i$ is the probability of choosing bin $i$.


As we have previously indicated, if $r_{i+1} - r_i$ is sufficiently small, we can assume, for all practical purposes, that $R_N^{(i)}$ is a uniformly distributed random variable. The picked resistance $R_p^{(i)}$ should also be a uniformly distributed random variable if all resistors in bin $i$ are equally likely to appear. In other words, let us assume that

(3.12)  $g^{(i)}(R_N^{(i)}) = f^{(i)}(R_p^{(i)}) = 1/(r_{i+1} - r_i)$.

Suppose then that one asks for the probability that $t_1 = t_N(1 - \delta) \le t \le t_N(1 + \delta) = t_2$ for a given small $\delta$. We proceed to derive closed form expressions for this probability. Three cases naturally arise, the first of which is shown in Figure 3 below. For brevity, we shall drop the superscript $i$ in this figure and the two following figures. In this diagram, the interior of the quadrilateral formed by the lines $R_N = r_i$, $R_N = r_{i+1}$, $R_p = t_1 R_N/t_N$, and $R_p = t_2 R_N/t_N$ is the region of integration. Note that, in the two hatched regions, $f^{(i)}(R_p^{(i)}) = 0$, since then either $R_p < r_i$ or $R_p > r_{i+1}$. After a small computation, one sees that the inequality $r_N^{(0)} < r_N^{(1)}$ is equivalent to

(3.13)  $0 < \delta < \epsilon$.


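We also note that, using (3.12), (3.10) represents the normalized area of the interior of the hexagon shown in Figure 3, bounded by the lines $R_N = r_i$, $R_N = r_{i+1}$, $R_p = t_1 R_N/t_N$, $R_p = t_2 R_N/t_N$, $R_p = r_i$, and $R_p = r_{i+1}$. Therefore,

(3.14)
$$P_i\big(t_N(1-\delta) \le t \le t_N(1+\delta)\big) = \frac{1}{(r_{i+1}-r_i)^2}\left[\int_{r_i}^{r_i/(1-\delta)}\!\int_{r_i}^{(1+\delta)R_N^{(i)}} dR_p^{(i)}\,dR_N^{(i)} + \int_{r_i/(1-\delta)}^{r_{i+1}/(1+\delta)}\!\int_{(1-\delta)R_N^{(i)}}^{(1+\delta)R_N^{(i)}} dR_p^{(i)}\,dR_N^{(i)} + \int_{r_{i+1}/(1+\delta)}^{r_{i+1}}\!\int_{(1-\delta)R_N^{(i)}}^{r_{i+1}} dR_p^{(i)}\,dR_N^{(i)}\right]$$
$$= \frac{\delta}{8\epsilon^2}\left[\frac{(1+\epsilon)^2(2+\delta)}{1+\delta} - \frac{(1-\epsilon)^2(2-\delta)}{1-\delta}\right], \qquad 0 < \delta \le \epsilon.$$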


It follows that $P_i$ is independent of $i$. From (3.11), since the $p_i$ sum to unity,

(3.15)  $P(t_1 \le t \le t_2) = P_i(t_1 \le t \le t_2)$.
Figure 3. Picked resistance versus nominal resistance (Region 1)


The second case occurs when $r_i < r_N^{(1)} \le r_N^{(0)} < r_{i+1}$. This situation is indicated in Figure 4. One can also show that $r_N^{(0)} = r_{i+1}$ when $\delta = 2\epsilon/(1 + \epsilon)$ and that $r_N^{(1)} = r_i$ when $\delta = 2\epsilon/(1 - \epsilon)$. Therefore, the situation illustrated in Figure 4 occurs when $\epsilon \le \delta < 2\epsilon/(1 + \epsilon)$. A third case will occur when $2\epsilon/(1 + \epsilon) \le \delta < 2\epsilon/(1 - \epsilon)$, as illustrated in Figure 5, where the dotted region is now a pentagon. For $\delta \ge 2\epsilon/(1 - \epsilon)$, the dotted region becomes the interior of a rectangle completely enclosed in the sector, so that the probability becomes unity. In the third case, one sees that $r_i < r_N^{(1)} < r_{i+1} \le r_N^{(0)}$. When one integrates over the interior of the quadrilateral outlined in Figure 4, one again obtains the closed form given in (3.14). Therefore, (3.14) is valid whenever $0 < \delta \le 2\epsilon/(1 + \epsilon)$. The case illustrated in Figure 5 is different. When we integrate over the interior of the pentagon, which is that portion of the region of integration for which the integrand of (3.10) is nonzero, we find that

(3.16)  $P_i\big(t_N(1-\delta) \le t \le t_N(1+\delta)\big) = 1 - \dfrac{[2\epsilon - \delta(1-\epsilon)]^2}{8\epsilon^2(1+\delta)}, \qquad \dfrac{2\epsilon}{1+\epsilon} \le \delta \le \dfrac{2\epsilon}{1-\epsilon}.$

One easily shows that (3.16) becomes unity when $\delta = 2\epsilon/(1 - \epsilon)$ is substituted.
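The closed forms of this section lend themselves to a quick numerical check. The sketch below evaluates the in-bin probability piecewise, using the expression for $0 < \delta \le 2\epsilon/(1+\epsilon)$ and the pentagon-case expression above that range, and compares it with a crude Monte Carlo estimate in which $R_N$ and $R_p$ are independent and uniform on the same bin, as assumed in (3.12). Function names, sample sizes, and seeds are illustrative choices, not taken from the paper.

```python
import random


def yield_probability(delta, eps):
    """P(t_N(1-delta) <= t <= t_N(1+delta)) for one bin of half-width eps."""
    if delta <= 0:
        return 0.0
    if delta >= 2 * eps / (1 - eps):
        return 1.0
    if delta <= 2 * eps / (1 + eps):
        # Hexagon cases (Figures 3 and 4).
        plus = (1 + eps) ** 2 * (2 + delta) / (1 + delta)
        minus = (1 - eps) ** 2 * (2 - delta) / (1 - delta)
        return delta * (plus - minus) / (8 * eps ** 2)
    # Pentagon case (Figure 5).
    return 1 - (2 * eps - delta * (1 - eps)) ** 2 / (8 * eps ** 2 * (1 + delta))


def yield_monte_carlo(delta, eps, trials=100_000, seed=1):
    """Estimate the same probability by sampling R_p / R_N in one bin."""
    rng = random.Random(seed)
    lo, hi = 1 - eps, 1 + eps      # bin endpoints in units of the bin centre
    hits = 0
    for _ in range(trials):
        ratio = rng.uniform(lo, hi) / rng.uniform(lo, hi)
        hits += (1 - delta) <= ratio <= (1 + delta)
    return hits / trials


for d in (0.005, 0.015, 0.02):
    print(d, yield_probability(d, 0.01), yield_monte_carlo(d, 0.01))
```

With $\epsilon = 0.01$, the three sample tolerances fall in the first, second, and third cases respectively, so the comparison exercises both closed forms.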


4.  PROBABILITY  INTERVALS  AT  AMBIENT  TEMPERATURE  BEFORE  POTTING 


The analysis of the timer when temperature and cold cathode diode firing voltage variations are considered is different from that of the previous section, since all components except for the resistor enter the time nonlinearly. It would then be necessary, at least in principle, to take into consideration the probabilities $p_i$ of picking the bins as well as the probabilities for picked resistance once a bin has been selected. However, if the variations due to these effects are relatively small, one should again see probabilities essentially independent of the bin selected. Furthermore, in a situation like this wherein certain distributions are quite tight, i.e., are of small variance, some simplifying assumptions can be made. We shall get to these presently. Again, as before, we assume that the bin intervals are so small that we may reasonably suppose that (3.12) is true. Note also that (2.3) and (2.4) express $R_N$ in terms of $t_N$, $C_1$, $C_2$, $V$, and $V_T$. Assume now that $C_1$, $C_2$, $V$, and $V_T$ are independent, normally distributed random variables. Suppose, as is common in practice when coefficients of variation are



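Here the subscript $E$ indicates evaluation at expected values and var represents the variance operator. Now, clearly,

(4.3)  $\dfrac{\partial R_N}{\partial C_i} = -\dfrac{t_N}{F^2}\,\dfrac{\partial F}{\partial C_i}, \qquad i = 1, 2,$

with similar expressions for $\partial R_N/\partial V$ and $\partial R_N/\partial V_T$. The relevant partial derivatives of $F$ are given by

(4.4)
$$\frac{\partial F}{\partial C_1} = \frac{C_2}{C_1 + C_2}\left[\frac{C_2}{C_1 + C_2}\,X - \frac{C_2 (V_T - V)}{Y}\right], \qquad \frac{\partial F}{\partial C_2} = \frac{C_1}{C_1 + C_2}\left[\frac{C_1}{C_1 + C_2}\,X + \frac{C_2 (V_T - V)}{Y}\right],$$
$$\frac{\partial F}{\partial V_T} = \frac{C_1 C_2}{Y}, \qquad \text{and} \qquad \frac{\partial F}{\partial V} = -\frac{C_1 C_2 V_T}{V Y},$$

where

$$X = \ln\left[\frac{V C_1}{V C_1 - (V_T - V)(C_1 + C_2)}\right], \qquad Y = V C_1 - (V_T - V)(C_1 + C_2).$$
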

Now $p_i$ represents the probability of choosing bin $i$, and that is precisely the probability that the random variable $R_N$ belongs to bin $i$. Furthermore, because we are now assuming that $R_N$ is a linear function of the independent, normal random variables $C_1$, $C_2$, $V$, and $V_T$, $R_N$ is likewise normal. Therefore, letting $\xi = E(R_N)$ and $\sigma^2 = \operatorname{var}(R_N)$, one has

(4.5)  $p_i = \dfrac{1}{\sqrt{2\pi}\,\sigma}\displaystyle\int_{r_i}^{r_{i+1}} e^{-(r-\xi)^2/2\sigma^2}\,dr = \dfrac{1}{\sqrt{2\pi}}\displaystyle\int_{v_1}^{v_2} e^{-v^2/2}\,dv,$

where $v_1 = (r_i - \xi)/\sigma$ and $v_2 = (r_{i+1} - \xi)/\sigma$, so that $p_i$ may be readily calculated from tables.
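In place of tables, the bin probabilities of (4.5) can be computed from the error function. The following sketch uses the signed bin index introduced for illustration earlier (0 for the reference bin) and takes the mean and standard deviation of $R_N$ as inputs; the numbers in the usage line are the values quoted in the example of this section, and the function names are assumptions.

```python
import math


def normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bin_probability(i, r0, eps, mean, sigma):
    """p_i of (4.5): probability that a normal R_N falls in bin i."""
    q = (1 + eps) / (1 - eps)
    r_left = r0 * (1 - eps) * q ** i
    r_right = r_left * q
    v1 = (r_left - mean) / sigma
    v2 = (r_right - mean) / sigma
    return normal_cdf(v2) - normal_cdf(v1)


# Reference bin for R0 = E(R_N) = 40.16 megohms, sigma = 4.04, eps = 0.01.
p0 = bin_probability(0, 40.16, 0.01, 40.16, 4.04)
print(p0)
```

Summing `bin_probability` over a wide enough range of indices should return a value very close to 1, which is a convenient sanity check on the bin layout.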


Supposing that the picked resistor and the other components are subject to a temperature change from the standard temperature, we must compute the effect of such a change, together with the resistor incrementation effect of Section 3, in order to obtain the probability of satisfying the specification. It will be assumed in our analysis that the ambient temperature is a uniformly distributed random variable whose range is given by $T_1 \le T \le T_2$. If $P(t_1 \le t \le t_2 \mid T)$ is the probability of meeting the time limits for a given temperature $T$, then, clearly,

(4.6)  $P(t_1 \le t \le t_2) = \displaystyle\int P(t_1 \le t \le t_2 \mid T)\,p(T)\,dT = \dfrac{1}{T_2 - T_1}\displaystyle\int_{T_1}^{T_2} P(t_1 \le t \le t_2 \mid T)\,dT.$
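The averaging in (4.6) is a one-dimensional integral and can be approximated with any standard rule. The sketch below uses composite Simpson's rule, which is a choice made here for simplicity and not the quadrature used by the author. The conditional probability `toy_yield` is a made-up placeholder with no physical meaning, included only so the routine can be run.

```python
def average_over_temperature(prob_given_temp, t_low, t_high, panels=20):
    """Mean of prob_given_temp(T) for T uniform on [t_low, t_high]."""
    if panels % 2:
        panels += 1                      # Simpson's rule needs an even count
    h = (t_high - t_low) / panels
    total = prob_given_temp(t_low) + prob_given_temp(t_high)
    for k in range(1, panels):
        weight = 4 if k % 2 else 2
        total += weight * prob_given_temp(t_low + k * h)
    return total * h / 3 / (t_high - t_low)


def toy_yield(temp_f):
    """Hypothetical conditional yield, peaked at the standard temperature."""
    return 0.95 - 0.002 * (temp_f - 75.0) ** 2


print(average_over_temperature(toy_yield, 70.0, 80.0))
```

In an actual calculation `prob_given_temp` would be replaced by the conditional probability obtained from the multiple integral developed later in this section.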


Let us give an example of the computation of the nominal resistor distribution. Supposing in (4.1) that $t_N = 2.6$ seconds, $C_{1,E} = .44\ \mu$f, $C_{2,E} = .15\ \mu$f, $V_E = 177$ v., and $V_{T,E} = 235$ v., one finds that $E(R_N) = 40.16$ megohms. Also, one finds from (4.3) and similar expressions, upon inserting expected values, that $\partial R_N/\partial C_1 = 8.65$, $\partial R_N/\partial C_2 = -293.12$, $\partial R_N/\partial V = 1.26$, and $\partial R_N/\partial V_T = -0.95$. Let us assume the following standard deviations: $\sigma(C_1) = 0.0073$, $\sigma(C_2) = 0.0025$, $\sigma(V) = 0.17$, and $\sigma(V_T^E) = 4.17$, where $V_T^E$ is used to denote the expected breakdown voltage of a diode chosen from a lot. The expected values of the breakdown voltages of all the tubes are themselves assumed to follow a normal distribution with expected value 235 v. and with the above $\sigma$. In addition, each tube has a firing voltage which varies about its expected value. This new random variable, with expected value 0, we denote by $\Delta V_E$, and it is assumed that $\Delta V_E$ is also normally distributed. The random variable $V_T$, which represents the firing voltage of a tube selected from a lot, is actually formed as a sum $V_T = V_T^E + \Delta V_E$, where we shall suppose that $\Delta V_E$ is independent of $V_T^E$. Also, tests performed by fuze specialists indicate that the random variables $\Delta V_E$ have the same distribution from one tube to the next. Assuming that $\sigma(\Delta V_E) = 0.24$, it follows that $\sigma(V_T) = 4.17$. Then, from (4.2), var $R_N = 16.3235$, or $\sigma(R_N) = 4.04$. Therefore, the coefficient of variation is 0.10, which is reasonably small.
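The numbers in this example can be approximately reproduced with a short first-order (linear) error propagation. The sketch below assumes that the variance is the usual sum of squared sensitivities times variances, which is what the linear treatment of $R_N$ implies, and it uses the partial derivatives of $F$ in the form given in (4.4). Capacitances are in microfarads so that resistance is in megohms; the function names are illustrative. The computed sensitivities should have the same signs and similar magnitudes to those quoted above, without necessarily agreeing digit for digit.

```python
import math


def factor_and_partials(c1, c2, v, vt):
    """Return F and (dF/dC1, dF/dC2, dF/dV, dF/dVT)."""
    y = v * c1 - (vt - v) * (c1 + c2)
    x = math.log(v * c1 / y)
    s = c1 + c2
    f = c1 * c2 / s * x
    df_dc1 = (c2 / s) * (c2 * x / s - c2 * (vt - v) / y)
    df_dc2 = (c1 / s) * (c1 * x / s + c2 * (vt - v) / y)
    df_dv = -c1 * c2 * vt / (v * y)
    df_dvt = c1 * c2 / y
    return f, (df_dc1, df_dc2, df_dv, df_dvt)


def nominal_resistance_moments(t_n, means, sigmas):
    """Mean, sensitivities, and linearized variance of R_N = t_N / F."""
    f, grads = factor_and_partials(*means)
    r_mean = t_n / f
    r_grads = [-t_n / f ** 2 * g for g in grads]
    variance = sum((a * sd) ** 2 for a, sd in zip(r_grads, sigmas))
    return r_mean, r_grads, variance


sigma_vt = math.hypot(4.17, 0.24)   # lot-to-lot and firing-to-firing spread
mean, sens, var = nominal_resistance_moments(
    2.6, (0.44, 0.15, 177.0, 235.0), (0.0073, 0.0025, 0.17, sigma_vt))
print(mean, sens, math.sqrt(var), math.sqrt(var) / mean)
```

The dominant contribution comes from the firing voltage term, which is why the spread of $V_T$ largely determines the coefficient of variation of $R_N$.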

We now develop a general method for evaluating the performance of the timer which is based on a linear theory. Hopefully, this theory will yield at least conservative estimates. Our formula is a generalization of that given in paragraph 3. First of all, from (2.3) and (2.4), it follows that

(4.7)  $R_N = t_N / F(C_1, C_2, V, V_T^{(1)})$,

where $V_T^{(1)} = V_T^E + \Delta V_E^{(1)}$. Therefore, solving (4.7) for $V_T^E$, where $F(C_1, C_2, V, V_T^{(1)})$ is given through (2.1) and (2.2), one finds that

(4.8)  $V_T^E = \dfrac{V C_1}{C_1 + C_2} - (\Delta V_E^{(1)} - V) - \dfrac{V C_1}{C_1 + C_2}\, e^{-t_N/(R_N C_{\mathrm{eff}})}$.

Here $C_{\mathrm{eff}}$, defined by $1/C_{\mathrm{eff}} = 1/C_1 + 1/C_2$, is the effective series capacitance of $C_1$ and $C_2$, and $\Delta V_E^{(1)}$ denotes that variation in tube firing voltage from its expected value which is associated with determination of the nominal resistance $R_N$. For brevity, we let $\mathcal{H}(C_1, C_2, V, R_N, \Delta V_E^{(1)})$ represent the right hand side of (4.8). There is, however, a second variation, which we shall denote by $\Delta V_E^{(2)}$, that occurs once a resistor has been selected from a bin and the timer actually operated. These two variations must be taken into account carefully when assessing timer performance.
One may now make a 1-1 transformation from the space of $(C_1, C_2, V, \Delta V_E^{(1)}, \Delta V_E^{(2)}, V_T^E)$ to that of $(C_1, C_2, V, \Delta V_E^{(1)}, \Delta V_E^{(2)}, R_N)$ through the map

(4.9)  $C_1 = C_1$, $C_2 = C_2$, $V = V$, $\Delta V_E^{(1)} = \Delta V_E^{(1)}$, $\Delta V_E^{(2)} = \Delta V_E^{(2)}$, $V_T^E = \mathcal{H}(C_1, C_2, V, R_N, \Delta V_E^{(1)})$,

whose Jacobian is $\partial V_T^E/\partial R_N$. It follows [3, pp. 56-62] that the density function for the state $(C_1, C_2, V, \Delta V_E^{(1)}, \Delta V_E^{(2)}, R_N)$ is

(4.10)  $f(C_1, C_2, V, \Delta V_E^{(1)}, \Delta V_E^{(2)}, R_N) = p_1(C_1)\,p_2(C_2)\,p_3(V)\,p_4(\Delta V_E^{(1)})\,p_5(\Delta V_E^{(2)})\,p_6[\mathcal{H}(C_1, C_2, V, R_N, \Delta V_E^{(1)})]\,|\partial V_T^E/\partial R_N|$,

where $p_i(C_i)$ $(i = 1, 2)$ are the densities for $C_i$, $p_3$ is the density for $V$, $p_4$ the density for $\Delta V_E^{(1)}$, $p_5$ the density for $\Delta V_E^{(2)}$, and $p_6$ the density for $V_T^E$. These random variables are all assumed to be independent. In addition, $\Delta V_E^{(1)}$ and $\Delta V_E^{(2)}$ are identically distributed. Next account must be taken of the fact that, because of a change in temperature, the capacitances $C_i$ will change in value. In fact, we assume that $C_i(T)$, where $T$ denotes temperature, is of the
form

(4.11)  $C_i(T) = C_i(1 + K_i(T - T_E)/100)$,

where $K_i$ represents a random percent change per degree from the expected temperature $T_E$. Thus $C_i(T)$ is a product convolution [3, pp. 56-62] of $C_i$ and the second factor, which we denote by $\Delta CP_i(T)$ (representing a percentage change in $C_i$ due to a temperature change from expected value $T_E$ to $T$). We then form the joint density $h(C_1, C_2, \Delta CP_1(T), \Delta CP_2(T), V, \Delta V_E^{(1)}, \Delta V_E^{(2)}, R_N)$ from $f$ and the densities for these percent changes. Afterwards, $h$ is multiplied by $p(R_p(T))$, the convolution density of picked resistance at temperature, where

(4.12)  $R_p(T) = R_p(1 + C(T - T_E)/100)$

and $C$ is a random percent change per degree. Finally, if we are interested in the conditional density for any given bin $i$, we must divide by $p_i$, the probability of choice of bin $i$. It is clear that, in order for the time output of the timer to fall between two chosen values $t_1$ and $t_2$, $R_p(T)$ must lie between

$t_1/F(C_1(T), C_2(T), V, V_T^{(2)})$

and

$t_2/F(C_1(T), C_2(T), V, V_T^{(2)})$,

where $V_T^{(2)} = V_T^E + \Delta V_E^{(2)}$ with $V_T^E$ given by (4.8). Also, from (4.11),


(4.13)  $C_i(T) = C_i\,\Delta CP_i(T)$.

Now let $X_T = (C_1, C_2, \Delta CP_1(T), \Delta CP_2(T), V, \Delta V_E^{(1)}, \Delta V_E^{(2)})$. There follows the general multiple integration formula, which expresses the probability $P_i$ that the time falls between $t_1$ and $t_2$ for bin $i$ and conditioned on temperature $T$:

(4.14)  $p_i\,P_i(t_1 \le t \le t_2 \mid T) = \displaystyle\int_{r_i}^{r_{i+1}} \int_{X_T \in R^7(-\infty,\infty)} \int_{t_1/F}^{t_2/F} p(R_p(T))\,h(X_T, R_N)\,dR_p(T)\,dX_T\,dR_N$,

where $R^7(-\infty, \infty)$ represents the seven-fold Cartesian product of the real line. Finally,

(4.15)  $P(t_1 \le t \le t_2) = \dfrac{1}{T_2 - T_1}\displaystyle\sum_{i=-\infty}^{\infty} p_i \int_{T_1}^{T_2} P_i(t_1 \le t \le t_2 \mid T)\,dT$,


given that the temperature distribution is uniform. This integration procedure could be accomplished on a digital computer through use of numerical Gaussian quadrature and Gauss-Hermite quadrature [5, pp. 130-132]. However, instead of using this general nonlinear approach, we find it convenient, in the present context, to linearize the products given by (4.11) and (4.12) and to make use of a linearized version of $R_N$ given by

(4.16)  $R_N = R_N(C_{1,E}, C_{2,E}, V_E, V_{T,E}^{(1)}) + A_1(C_1 - C_{1,E}) + A_2(C_2 - C_{2,E}) + A_3(V - V_E) + A_4(V_T^{(1)} - V_{T,E}^{(1)})$,

where, of course,

$A_1 = \dfrac{\partial R_N}{\partial C_1}, \quad A_2 = \dfrac{\partial R_N}{\partial C_2}, \quad A_3 = \dfrac{\partial R_N}{\partial V}, \quad A_4 = \dfrac{\partial R_N}{\partial V_T}$

are evaluated at the expected values for the components and $V_{T,E}^{(1)}$ represents the expected value of random variable $V_T^{(1)}$. (4.11) now becomes

(4.17)  $C_i(T) = C_{i,E}(K_i - K_{i,E})(T - T_E)/100 + C_i(1 + K_{i,E}(T - T_E)/100)$,

where $K_{i,E}$ represents the expected value of $K_i$, and (4.12) becomes

(4.18)  $R_p(T) = [1 + C_E(T - T_E)/100]\,R_p + R_c(C - C_E)(T - T_E)/100$,

where $R_c$ is the center of the bin considered. Note that the effect of (4.17) and (4.18) is to replace product convolutions by convolutions of sums of random variables when it comes to computing densities. Also, supposing that $t_1 = t_N(1 - \delta)$ and $t_2 = t_N(1 + \delta)$, the limits on the innermost integral of (4.14) become $t_1/F = (1 - \delta)t_N/F$ and $t_2/F = (1 + \delta)t_N/F$, respectively.
The functional form $t_N/F$ is to be replaced by the linearized version (4.16) with $C_1(T)$, $C_2(T)$, and $V_T^{(2)}$ substituted for $C_1$, $C_2$, and $V_T$, respectively. We have, therefore, after a small computation,

(4.19)  $t_1/F = (1 - \delta)[R_N + A_1 \Delta C_1(T) + A_2 \Delta C_2(T) + A_4(\Delta V_E^{(2)} - \Delta V_E^{(1)})]$

and, likewise,

(4.20)  $t_2/F = (1 + \delta)[R_N + A_1 \Delta C_1(T) + A_2 \Delta C_2(T) + A_4(\Delta V_E^{(2)} - \Delta V_E^{(1)})]$,

where $\Delta C_i(T) = C_i(T) - C_i$. When $C_1$, $C_2$, $V$, and $V_T$ are independent, normally distributed random variables, the analysis is a bit simpler, since it is easily seen that, in this case, the pair $(R_N, \Delta V_E^{(1)})$ is bivariate normal [3, pg. 162]. In addition, one notes that (4.19) and (4.20) do not depend on $C_1$, $C_2$, and $V$ in the linear analysis. In Section 6, we present a numerical example following this procedure. It may be noted, by analogy with the development in paragraph 3, that the condition $t_1/F \le R_p(T) \le t_2/F$ is equivalent to requiring that $R_p(T)$ lie between two hyperplanes in the six-dimensional $(R_N, \Delta C_1, \Delta C_2, \Delta V_E^{(1)}, \Delta V_E^{(2)}, R_p(T))$ space.
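To get a feel for how much is lost by linearizing the temperature product for the picked resistance, one can compare the exact product form with its linearized counterpart on random draws. This is only a sketch: the coefficient statistics are those quoted in Section 6, the bin half-width and temperature are illustrative, and the bin is taken to be centred at $R_c$ with endpoints $R_c(1 \mp \epsilon)$.

```python
import random


def picked_resistance_exact(r_p, c, temp, t_e=75.0):
    """Product form: R_p (1 + C (T - T_E) / 100)."""
    return r_p * (1 + c * (temp - t_e) / 100)


def picked_resistance_linear(r_p, c, temp, r_c, c_e, t_e=75.0):
    """Linearized form about the bin centre r_c and mean coefficient c_e."""
    dt = (temp - t_e) / 100
    return (1 + c_e * dt) * r_p + r_c * (c - c_e) * dt


def worst_relative_gap(r_c, eps, c_e, sigma_c, temp, trials=10_000, seed=2):
    """Largest relative difference between the two forms over random draws."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        r_p = rng.uniform(r_c * (1 - eps), r_c * (1 + eps))
        c = rng.gauss(c_e, sigma_c)
        exact = picked_resistance_exact(r_p, c, temp)
        approx = picked_resistance_linear(r_p, c, temp, r_c, c_e)
        worst = max(worst, abs(exact - approx) / exact)
    return worst


print(worst_relative_gap(40.16, 0.01, -0.0235, 0.0078, 80.0))
```

The difference between the two forms is the second-order term $(R_p - R_c)(C - C_E)(T - T_E)/100$, so it shrinks with both the bin width and the spread of the temperature coefficient.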

5.  PROBABILITY  INTERVALS  AT  AMBIENT  TEMPERATURE  AFTER  POTTING 

When  the  timer  is  actually  packaged,  or  potted,  this  procedure  will  produce  statistical 
changes  in  the  component  values.  These  changes  are  known  in  the  trade  as  potting  shifts. 
Such  shifts  can  be  taken  into  account  by  convolutions  of  the  densities  previously  determined 
with  those  densities  evolving  from  the  operation  of  potting.  This  has  an  effect  on  such  items  as 
the  picked  resistor,  the  capacitors,  and  the  voltage  regulator.  Generally,  potting  shifts  are 
represented  as  percentage  changes  from  previous  values,  and,  therefore,  strictly  speaking,  we 
have  another  product  convolution  to  consider.  For  example,  we  represent  the  value  of  resis¬ 
tance  due  to  temperature  and  potting  by 

(5.1)  $R_{pot}(T) = R_p(T)(1 + \mathrm{CHG}_1/100)$,

where the subscript pot denotes potting and $\mathrm{CHG}_1$ represents a random per cent change from the value of picked resistance at temperature. If we linearize $R_{pot}(T)$ about nominal values, we find that

(5.2)  $R_{pot}(T) = (1 + \mathrm{CHG}_{1,E}/100)\,R_p(T) + R_p^E(T)(\mathrm{CHG}_1 - \mathrm{CHG}_{1,E})/100$,

where $R_p^E(T)$ is the expected value of picked resistance at temperature for the given bin and $\mathrm{CHG}_{1,E}$ is the expected value of $\mathrm{CHG}_1$. From (4.18), this is given, to a first approximation, by

(5.3)  $R_p^E(T) = [1 + C_E(T - T_E)/100]\,R_c$,

where, as before, $R_c$ is the center of the bin interval. As for the capacitances, we assume a form

(5.4)  $C_{i,pot}(T) = C_i(T)(1 + \mathrm{CHG}_2/100)$,

so that we would linearize $C_{i,pot}(T)$ about nominal values in a manner analogous to that for $R_{pot}(T)$. Lastly, the voltage regulator value after potting is representable by

(5.5)  $V_{pot} = V + \mathrm{CHG}_3$.

Hence, we need only go back through our analysis with $R_p(T)$ replaced by $R_{pot}(T)$, $C_i(T)$ replaced by $C_{i,pot}(T)$, and $V$ replaced by $V_{pot}$. It is assumed that $V_T$, the cold cathode diode tube firing voltage, is unaffected by potting. One more integration, corresponding to $\mathrm{CHG}_3$, is introduced in order to take account of the change in regulator voltage due to potting.

6. NUMERICAL RESULTS

Using  a  CDC  6600  computer,  we  were  able  to  develop  a  computer  code  which  can  be 
used  to  predict  efficiently  the  performance  of  the  timer  under  the  linearity  assumptions  outlined 
in  the  two  previous  paragraphs.  The  integration  scheme  developed  will,  in  this  paragraph,  be 
discussed  in  some  detail.  A  listing  of  the  computer  code  used  can  be  provided  on  request. 

First  of  all,  in  (4.18),  we  assume  that  Rp  has  a  uniform  distribution  across  the  bin  which 
is  being  considered  and  that  C  is  normally  distributed.  Let  us  suppose,  as  an  example,  that 
CE  -  -0.0235,  Te  -  75°F,  and  or(C)  =  0.0078.  Then,  of  course,  from  (4.18), 

(6.1)  RP(T)  -  fl  -  0.0235(r  -  75)/100]/?<>  +  Rc(C  +  0.0235)(r  -  75)/100. 


Therefore, $R_P(T)$ is a sum of two independent random variables, one of which is uniform and the other of which is normal and of mean 0. It follows that

(6.2)  $$p(R_P(T)) = \frac{1}{\sqrt{2\pi}\,(0.000078)\,|T-75|\,R_C\,(r_{i+1}-r_i)[1-0.000235(T-75)]} \int_{(1-0.000235(T-75))r_i}^{(1-0.000235(T-75))r_{i+1}} \exp\left\{-\frac{1}{2}\left[\frac{u-R_P(T)}{R_C(0.000078)(T-75)}\right]^2\right\}du.$$

Letting

$v = (u - R_P(T))/R_C(0.000078)|T - 75|,$

(6.2) is converted into

(6.3)  $$p(R_P(T)) = \frac{1}{\sqrt{2\pi}\,(r_{i+1}-r_i)[1-0.000235(T-75)]}\int_{v_1}^{v_2} e^{-v^2/2}\,dv,$$

where

(6.4)  $v_1 = [(1 - 0.000235(T - 75))r_i - R_P(T)]/R_C(0.000078)|T - 75|$

and

(6.5)  $v_2 = [(1 - 0.000235(T - 75))r_{i+1} - R_P(T)]/R_C(0.000078)|T - 75|.$
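The density (6.3) can be evaluated directly with the error function. The following is a minimal sketch under the numerical assumptions of (6.1); the function names `v_limits` and `density` are hypothetical and are not taken from the authors' CDC 6600 code.

```python
import math

A = 0.000235  # magnitude of the mean temperature coefficient in (6.1), per degree F
B = 0.000078  # sigma(C)/100 in (6.1), per degree F


def v_limits(R, T, r_lo, r_hi, R_c):
    """Integration limits (v1, v2) of (6.4) and (6.5); requires T != 75."""
    scale = R_c * B * abs(T - 75.0)
    shrink = 1.0 - A * (T - 75.0)
    return (shrink * r_lo - R) / scale, (shrink * r_hi - R) / scale


def density(R, T, r_lo, r_hi, R_c):
    """Density (6.3) of R_P(T) for a bin [r_lo, r_hi] with centre R_c."""
    v1, v2 = v_limits(R, T, r_lo, r_hi, R_c)
    shrink = 1.0 - A * (T - 75.0)
    # (1/sqrt(2 pi)) * integral of exp(-v^2/2) from v1 to v2, via the error function
    mass = 0.5 * (math.erf(v2 / math.sqrt(2.0)) - math.erf(v1 / math.sqrt(2.0)))
    return mass / ((r_hi - r_lo) * shrink)


if __name__ == "__main__":
    # Fourth bin of Table 1, evaluated at the centre of the shifted bin at 85 F.
    r_lo, r_hi, R_c, T = 39.7590, 40.5622, 40.1606, 85.0
    R = (1.0 - A * (T - 75.0)) * R_c
    print(density(R, T, r_lo, r_hi, R_c))
```

Well inside the shifted bin the limits satisfy $v_1 < -3 < 3 < v_2$, so the computed value is numerically indistinguishable from the constant $1/((r_{i+1}-r_i)[1-0.000235(T-75)])$.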

Several cases now arise according to the value of $R_P(T)$ and according to whether or not $T > 75$°F. We first consider the case when $T > 75$°. Let us develop an inequality which allows us to assert that $v_1 < -3$. In fact, suppose that

(6.6)  $R_P(T) - r_i > k_1 r_i(0.000235)(T - 75),$

where $k_1$ is to be so determined that $v_1 < -3$ is valid. Upon substituting (6.6) into (6.4), one has

(6.7)  $v_1 < -(k_1 + 1)r_i(0.000235)/R_C(0.000078).$

Remembering that $r_i/R_C = 1 - \epsilon$ and that $0.000235/0.000078 \approx 3$, we find that $k_1 = \epsilon/(1 - \epsilon)$ will yield the requisite inequality. Next let us obtain an inequality which will permit us to say that $v_2 > 3$. Suppose that

(6.8)  $r_{i+1} - R_P(T) > k_2 r_{i+1}(0.000235)(T - 75).$

Then, from (6.5), we have

(6.9)  $v_2 > 3(k_2 - 1)(1 + \epsilon).$

The right side of (6.9) equals 3 when

$k_2 = (2 + \epsilon)/(1 + \epsilon).$
Thus, if, for $T > 75$,

(6.10)  $$r_i(\epsilon) = r_i\left(1 + \frac{\epsilon}{1-\epsilon}(0.000235)(T - 75)\right) < R_P(T) < r_{i+1}\left(1 - \frac{2+\epsilon}{1+\epsilon}(0.000235)(T - 75)\right) = r_{i+1}(\epsilon),$$

it follows from (6.3) that

(6.11)  $$p(R_P(T)) = \frac{1}{(r_{i+1} - r_i)[1 - 0.000235(T - 75)]}.$$


Now suppose that $T < 75$. Letting $R_P(T) - r_i > k_3 r_i(0.000235)(T - 75)$, it follows that

(6.12)  $v_1 < 3(k_3 + 1)(1 - \epsilon).$

The right side equals $-3$ when $k_3 = -(2 - \epsilon)/(1 - \epsilon)$. Again, assuming that $r_{i+1} - R_P(T) > k_4 r_{i+1}(0.000235)(T - 75)$, we have

(6.13)  $v_2 > -3(1 + \epsilon)(k_4 - 1),$

which equals 3 when $k_4 = \epsilon/(1 + \epsilon)$. Therefore, when $T < 75$ and

(6.14)  $$r_i'(\epsilon) = r_i\left(1 - \frac{2-\epsilon}{1-\epsilon}(0.000235)(T - 75)\right) < R_P(T) < r_{i+1}\left(1 - \frac{\epsilon}{1+\epsilon}(0.000235)(T - 75)\right) = r_{i+1}'(\epsilon),$$

(6.11) is again satisfied. Next let us go back to the case when $T > 75$, but let us now require that $v_2 < -3$. We find that such is true when

(6.15)  $$R_P(T) > r_{i+1} - \frac{\epsilon}{1+\epsilon}\,r_{i+1}(0.000235)(T - 75).$$

Since $v_2 < -3$ also implies that $v_1 < -3$, we can assume that $p(R_P(T)) = 0$ in this case. Likewise, one finds that $v_1 > 3$ whenever

(6.16)  $$R_P(T) < r_i - \frac{2-\epsilon}{1-\epsilon}\,r_i(0.000235)(T - 75),$$

so that, in this range, $p(R_P(T)) = 0$, also. When $T < 75$, $v_2 \le -3$ whenever

(6.17)  $$R_P(T) \ge r_{i+1} - \frac{2+\epsilon}{1+\epsilon}\,r_{i+1}(0.000235)(T - 75),$$

and $v_1 > 3$ when

(6.18)  $$R_P(T) < r_i + \frac{\epsilon}{1-\epsilon}\,r_i(0.000235)(T - 75).$$

Again it follows that $p(R_P(T)) = 0$. Now there remain certain intervals in which $p(R_P(T))$ cannot be treated as constant for a given temperature. For example, it is found that, when $T > 75$ and

(6.19)  $$r_{i+1}(\epsilon) = r_{i+1}\left(1 - \frac{2+\epsilon}{1+\epsilon}(.000235)(T - 75)\right) \le R_P(T) < r_{i+1}\left(1 - \frac{\epsilon}{1+\epsilon}(.000235)(T - 75)\right) = s_{i+1}(\epsilon),$$

$-3 < v_2 < 3$ while $v_1 < -3$. Also, in the interval

(6.20)  $$s_i(\epsilon) = r_i\left(1 - \frac{2-\epsilon}{1-\epsilon}(.000235)(T - 75)\right) \le R_P(T) < r_i\left(1 + \frac{\epsilon}{1-\epsilon}(.000235)(T - 75)\right) = r_i(\epsilon),$$

$-3 < v_1 < 3$ while $v_2 > 3$. When $T < 75$, $p(R_P(T))$ cannot be treated as constant whenever

(6.21)  $$s_i'(\epsilon) = r_i\left(1 + \frac{\epsilon}{1-\epsilon}(.000235)(T - 75)\right) < R_P(T) < r_i'(\epsilon)$$

or

(6.22)  $$r_{i+1}'(\epsilon) < R_P(T) < r_{i+1}\left(1 - \frac{2+\epsilon}{1+\epsilon}(.000235)(T - 75)\right) = s_{i+1}'(\epsilon).$$
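The case analysis in (6.10) through (6.22) amounts to four breakpoints per bin and temperature: outside the outer pair the density is treated as zero, between the inner pair it is the constant (6.11), and in between it must be integrated. A minimal sketch follows, with hypothetical function names `breakpoints` and `classify`, assuming the approximation $0.000235/0.000078 \approx 3$ used in the text.

```python
A = 0.000235  # magnitude of the mean temperature coefficient, per degree F


def breakpoints(r_lo, r_hi, eps, T):
    """Return (s_lo, f_lo, f_hi, s_hi) for the bin [r_lo, r_hi].

    Below s_lo or at/above s_hi the density is taken as zero; strictly
    between f_lo and f_hi it is the constant of (6.11).
    """
    d = A * (T - 75.0)
    if T > 75.0:
        s_lo = r_lo * (1.0 - (2.0 - eps) / (1.0 - eps) * d)  # s_i(eps), (6.20)
        f_lo = r_lo * (1.0 + eps / (1.0 - eps) * d)          # r_i(eps), (6.10)
        f_hi = r_hi * (1.0 - (2.0 + eps) / (1.0 + eps) * d)  # r_{i+1}(eps), (6.10)
        s_hi = r_hi * (1.0 - eps / (1.0 + eps) * d)          # s_{i+1}(eps), (6.19)
    else:
        s_lo = r_lo * (1.0 + eps / (1.0 - eps) * d)          # s_i'(eps), (6.21)
        f_lo = r_lo * (1.0 - (2.0 - eps) / (1.0 - eps) * d)  # r_i'(eps), (6.14)
        f_hi = r_hi * (1.0 - eps / (1.0 + eps) * d)          # r_{i+1}'(eps), (6.14)
        s_hi = r_hi * (1.0 - (2.0 + eps) / (1.0 + eps) * d)  # s_{i+1}'(eps), (6.22)
    return s_lo, f_lo, f_hi, s_hi


def classify(R, r_lo, r_hi, eps, T):
    """Label R as 'zero', 'flat' or 'transition' for the given bin and temperature."""
    s_lo, f_lo, f_hi, s_hi = breakpoints(r_lo, r_hi, eps, T)
    if R < s_lo or R >= s_hi:
        return "zero"
    if f_lo < R < f_hi:
        return "flat"
    return "transition"


if __name__ == "__main__":
    print(breakpoints(39.7590, 40.5622, 0.01, 85.0))
    print(classify(40.0662, 39.7590, 40.5622, 0.01, 85.0))
```

For large $|T - 75|$ the inner pair can cross, in which case no flat region exists and the whole bin has to be handled by quadrature.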

The intervals so developed, in which the behavior of $p(R_P(T))$ is examined, are very important in the numerical study conducted on the CDC 6600. We now set up the precise procedure used in the computer study. First of all, referring to (4.19) and (4.20), we find it a little more natural to integrate with respect to $\Delta C_1(T)$ or $\Delta C_2(T)$ first instead of $R_P(T)$. We see then that our region of integration is fully specified by

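(6.23)  $f_1(\Delta C_2, R_P(T), R_N, \Delta V_T^{(1)}, \Delta V_T^{(2)}) < \Delta C_1 < f_2(\Delta C_2, R_P(T), R_N, \Delta V_T^{(1)}, \Delta V_T^{(2)})$

$-\infty < \Delta C_2 < +\infty$

$-\infty < R_P(T) < +\infty$

$-\infty < \Delta V_T^{(1)} < \infty$

$-\infty < \Delta V_T^{(2)} < \infty$

$r_i < R_N < r_{i+1}$

$T_1 < T < T_2,$

where, for $A_1 > 0$,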

(6.24)  /,  -  -7-  44t'  -  /12AC2-  -  ^4(A^2)  -  A^») 

4|  I  T  O 

/2“77  44t-^c2-^-.4(a^-a^>) 

and  the  inclusion  of  negative  values  for  RP(T )  is  merely  a  mathematical  artifice.  The  density 
function  for  this  process  then  has  the  following  form: 

(6.25)  $h(\Delta C_1, \Delta C_2, R_N, R_P(T), \Delta V_T^{(1)}, \Delta V_T^{(2)}) = p_1(\Delta C_1)\,p_2(\Delta C_2)\,p_3(R_P(T))\,p_4(R_N, \Delta V_T^{(1)})\,p_5(\Delta V_T^{(2)})/p_i(T_2 - T_1).$

The densities $p_1$, $p_2$, and $p_5$ are all normal densities. The mass function $p_3$ was ascertained in (6.3). $p_4$ is a bivariate normal density, and $p_i$ is the probability of being in bin $i$. It is easy to determine the correlation coefficient $\rho$ for $p_4$. Multiplying $\Delta V_T^{(1)}$ by $R_N$, as given by (4.16), we have

(6.26)  $E(R_N \Delta V_T^{(1)}) = A_4 E[(\Delta V_T^{(1)})^2] = A_4\sigma^2(\Delta V_T^{(1)}),$

and, since the expected value of $\Delta V_T^{(1)}$ is zero, $\mathrm{cov}(R_N, \Delta V_T^{(1)}) = E(R_N \Delta V_T^{(1)})$. It follows, using (6.26), that

(6.27)  $\rho = A_4\sigma(\Delta V_T^{(1)})/\sigma(R_N).$
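The factors in (6.25), other than $p_3$, are given by

$$p_1(\Delta C_1) = \frac{1}{(2\pi)^{1/2}\sigma(\Delta C_1)}\exp\left\{-\frac{1}{2}\left[\frac{\Delta C_1 - E(\Delta C_1)}{\sigma(\Delta C_1)}\right]^2\right\}$$

$$p_2(\Delta C_2) = \frac{1}{(2\pi)^{1/2}\sigma(\Delta C_2)}\exp\left\{-\frac{1}{2}\left[\frac{\Delta C_2 - E(\Delta C_2)}{\sigma(\Delta C_2)}\right]^2\right\}$$

$$p_4(R_N, \Delta V_T^{(1)}) = \frac{1}{2\pi\sigma(R_N)\sigma(\Delta V_T^{(1)})\sqrt{1-\rho^2}}\exp\left\{\frac{-1}{2(1-\rho^2)}\left[\left(\frac{R_N - E(R_N)}{\sigma(R_N)}\right)^2 + \left(\frac{\Delta V_T^{(1)} - E(\Delta V_T^{(1)})}{\sigma(\Delta V_T^{(1)})}\right)^2 - 2\rho\left(\frac{R_N - E(R_N)}{\sigma(R_N)}\right)\left(\frac{\Delta V_T^{(1)} - E(\Delta V_T^{(1)})}{\sigma(\Delta V_T^{(1)})}\right)\right]\right\}$$

$$p_5(\Delta V_T^{(2)}) = \frac{1}{(2\pi)^{1/2}\sigma(\Delta V_T^{(2)})}\exp\left\{-\frac{1}{2}\left[\frac{\Delta V_T^{(2)} - E(\Delta V_T^{(2)})}{\sigma(\Delta V_T^{(2)})}\right]^2\right\}$$

where $\rho$ is given by (6.27) and $p_4$ is the well-known joint normal density for two variates [7, pp. 111-114].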


Now let $K_{iE} = .04$ for $i = 1,2$ in (4.17) and $\sigma(K_i) = .013$, $i = 1,2$. Recall from our discussion in paragraph IV that $C_{1E} = .44\,\mu$f, $C_{2E} = .15\,\mu$f, $E(R_N) = 40.16$ megohms, $A_1 = 8.65$, $A_2 = 293.12$, $A_4 = -0.95$, and $\sigma(R_N) = 4.04$. In addition, suppose that $E(\Delta V_T^{(1)}) = E(\Delta V_T^{(2)}) = 0$ and $\sigma(\Delta V_T^{(1)}) = \sigma(\Delta V_T^{(2)}) = 0.2357$. Then it is seen that

$E(\Delta C_1) = 0.000176(T - 75)$, $\sigma(\Delta C_1) = 0.00005874|T - 75|$,

$E(\Delta C_2) = 0.00006(T - 75)$, and $\sigma(\Delta C_2) = 0.00002034|T - 75|$.
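As a worked check of (6.27), the correlation coefficient can be computed from the constants quoted above. This is only an arithmetic sketch; the variable names are hypothetical, and the values $A_4 = -0.95$, $\sigma(R_N) = 4.04$ and a firing-voltage standard deviation of 0.2357 are read from the text.

```python
A4 = -0.95         # coefficient of the firing-voltage variation in (4.16), as quoted
SIGMA_RN = 4.04    # sigma(R_N), megohms
SIGMA_DV = 0.2357  # standard deviation of the firing-voltage variation


def correlation(a4, sigma_dv, sigma_rn):
    """Correlation coefficient rho of (6.27): covariance over product of sigmas."""
    covariance = a4 * sigma_dv ** 2  # (6.26), using the zero mean of the variation
    return covariance / (sigma_rn * sigma_dv)


if __name__ == "__main__":
    rho = correlation(A4, SIGMA_DV, SIGMA_RN)
    print(round(rho, 5))
```

With these inputs $\rho$ is about $-0.055$, so $R_N$ and the first firing-voltage variation are only weakly correlated in this case study.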


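Next we make several changes of variable. Let

(6.28)  $u = (\Delta C_1 - E(\Delta C_1))/\sqrt{2}\,\sigma(\Delta C_1)$

$w = (\Delta C_2 - E(\Delta C_2))/\sqrt{2}\,\sigma(\Delta C_2)$

$z = v/\sqrt{2}$

$u_1 = (R_N - E(R_N))/\sqrt{2(1 - \rho^2)}\,\sigma(R_N)$

$w_1 = \Delta V_T^{(1)}/\sqrt{2(1 - \rho^2)}\,\sigma(\Delta V_T^{(1)})$

$w_2 = \Delta V_T^{(2)}/\sqrt{2}\,\sigma(\Delta V_T^{(2)}).$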


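Then (6.25) becomes

(6.29)  $$h = \frac{\sqrt{1-\rho^2}}{\pi^3(r_{i+1} - r_i)[1 + C_E(T-75)/100]}\; e^{-u^2}\, e^{-w^2}\, e^{-w_2^2}\, e^{-(w_1^2 - 2\rho u_1 w_1 + u_1^2)} \int_{v_1/\sqrt{2}}^{v_2/\sqrt{2}} e^{-z^2}\,dz \Big/ p_i(T_2 - T_1).$$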

Now one finds, by completing the square, that

(6.30)  $e^{-(w_1^2 - 2\rho u_1 w_1 + u_1^2)} = e^{-(w_1 - \rho u_1)^2 - (1 - \rho^2)u_1^2}.$

Next we let $w_3 = w_1 - \rho u_1$ and $u_2 = \sqrt{1 - \rho^2}\,u_1$. Our integrand becomes


(6.31)  $$h = \frac{1}{\pi^3(r_{i+1} - r_i)[1 + C_E(T-75)/100]}\; e^{-u^2}\, e^{-w^2}\, e^{-w_2^2}\, e^{-w_3^2}\, e^{-u_2^2} \int_{v_1/\sqrt{2}}^{v_2/\sqrt{2}} e^{-z^2}\,dz \Big/ p_i(T_2 - T_1).$$
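The step from (6.30) to the separated integrand rests on the substitution $w_3 = w_1 - \rho u_1$, $u_2 = \sqrt{1-\rho^2}\,u_1$, which turns the correlated quadratic form into a plain sum of squares. A small numerical check is sketched below; the function names are hypothetical.

```python
import math


def decouple(u1, w1, rho):
    """Map (u1, w1) to (u2, w3) as in the substitution following (6.30)."""
    w3 = w1 - rho * u1
    u2 = math.sqrt(1.0 - rho * rho) * u1
    return u2, w3


def correlated_form(u1, w1, rho):
    """Exponent of the bivariate factor before the substitution."""
    return w1 * w1 - 2.0 * rho * u1 * w1 + u1 * u1


def separated_form(u2, w3):
    """Exponent after the substitution: a sum of two squares."""
    return w3 * w3 + u2 * u2


if __name__ == "__main__":
    rho = -0.0554
    for u1, w1 in [(0.3, -1.2), (2.0, 0.5), (-1.1, -0.7)]:
        u2, w3 = decouple(u1, w1, rho)
        print(correlated_form(u1, w1, rho), separated_form(u2, w3))
```

The Jacobian of the substitution is $\sqrt{1-\rho^2}$, which is why that factor appears in the density before the substitution and not after it.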


For brevity, set $Y = (w, w_2, w_3)$, and let $R^3$ denote the usual three-dimensional Euclidean space. Also, put $u_{2i} = (r_i - E(R_N))/\sqrt{2}\,\sigma(R_N)$ and $u_{2,i+1} = (r_{i+1} - E(R_N))/\sqrt{2}\,\sigma(R_N)$. Then our integration scheme becomes

(6.32)  $P_i(t_N(1 - \delta) \le t \le t_N(1 + \delta)) = A_1 + A_2,$

where

r,  dm>dT\ Xy n„,„ 

+  I;u>  Jf,(k.Uj.«)  h{u.Y.R.u2)dudR  + /,.  |(e)  JF)(  h(u.Y.R.u2)dudR 


dudR 


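and $A_2$ is obtained by using 75 and $T_2$ for limits on the $T$ integration in place of $T_1$ and 75, respectively, with primed quantities replaced by unprimed quantities. In addition, we have set

(6.34)  $F_1(Y, u_2, R) = F_1(w, u_2, w_2, w_3, R) = [f_1(\Delta C_2, R, \Delta V_T^{(2)}, \Delta V_T^{(1)}, R_N) - E(\Delta C_1)]/\sqrt{2}\,\sigma(\Delta C_1)$

$F_2(Y, u_2, R) = F_2(w, u_2, w_2, w_3, R) = [f_2(\Delta C_2, R, \Delta V_T^{(2)}, \Delta V_T^{(1)}, R_N) - E(\Delta C_1)]/\sqrt{2}\,\sigma(\Delta C_1).$

Now $f_1$ and $f_2$ were defined in (6.24), and, from the changes of variable given by (6.28), we have

(6.35)  $\Delta C_2 = E(\Delta C_2) + \sqrt{2}\,w\,\sigma(\Delta C_2)$

$\Delta V_T^{(2)} = \sqrt{2}\,\sigma(\Delta V_T^{(2)})\,w_2$

$\Delta V_T^{(1)} = \sqrt{2(1 - \rho^2)}\,\sigma(\Delta V_T^{(1)})\,(w_3 + \rho u_2/\sqrt{1 - \rho^2})$

$R_N = E(R_N) + \sqrt{2}\,\sigma(R_N)\,u_2.$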

Our computer code is just the implementation of a nesting procedure, making use of Gaussian and Hermite-Gaussian quadrature routines, together with routines to evaluate the error integral [5, pp. 130-132], [6, pp. 319-330], [8], [1, pg. 924]. It turned out to be convenient, numerically accurate, and efficient in running time to employ three Gauss points per integration step.
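To illustrate the kind of nesting described above, the sketch below combines a three-point Gauss-Legendre rule on a finite interval with a three-point Gauss-Hermite rule for a weight $e^{-y^2}$ on the whole line. It is not the authors' routine; the function names are hypothetical, and only the standard three-point nodes and weights are used.

```python
import math

# Three-point Gauss-Hermite rule for integrals of exp(-y^2) * f(y) over the real line.
GH_NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
GH_WEIGHTS = (math.sqrt(math.pi) / 6.0, 2.0 * math.sqrt(math.pi) / 3.0, math.sqrt(math.pi) / 6.0)

# Three-point Gauss-Legendre rule on [-1, 1].
GL_NODES = (-math.sqrt(0.6), 0.0, math.sqrt(0.6))
GL_WEIGHTS = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)


def gauss_hermite3(f):
    """Approximate the integral of exp(-y^2) * f(y) over (-inf, inf)."""
    return sum(w * f(y) for y, w in zip(GH_NODES, GH_WEIGHTS))


def gauss_legendre3(f, a, b, steps=1):
    """Approximate the integral of f over [a, b] with three Gauss points per step."""
    h = (b - a) / steps
    total = 0.0
    for k in range(steps):
        mid = a + (k + 0.5) * h
        total += 0.5 * h * sum(w * f(mid + 0.5 * h * x) for x, w in zip(GL_NODES, GL_WEIGHTS))
    return total


def nested(f, a, b, steps=1):
    """Integrate exp(-y^2) * f(x, y) over x in [a, b] and all real y."""
    return gauss_legendre3(lambda x: gauss_hermite3(lambda y: f(x, y)), a, b, steps)


if __name__ == "__main__":
    # Exact value is sqrt(pi)/4, since each rule is exact for polynomials up to degree 5.
    print(nested(lambda x, y: x * y * y, 0.0, 1.0), math.sqrt(math.pi) / 4.0)
```

Each additional Gaussian variable in the integrand adds one more Hermite layer, so the cost grows as three to the power of the number of nested variables per integration step.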


The effect of cold cathode diode firing voltage variations in this problem is more significant than that of ambient temperature departures from nominal. In our case study, for example, when $\epsilon = .01$ and $\delta = .02$, $P_i$ was essentially 91%. With $\delta = .03$, this figure was increased to almost 100%. Results for six bins with $\epsilon = .01$ and $\delta = .03$ are given in Table 1.


TABLE 1: Performance of Fuze Timer for Representative Bins

P_i        r_i        r_{i+1}    R_C
.994848    37.4435    38.2000    37.8218
.995103    38.2000    38.9717    38.5858
.995221    38.9717    39.7590    39.3653
.995414    39.7590    40.5622    40.1606
.995452    40.5622    41.3817    40.9719
.995490    41.3817    42.2176    41.7997
It is seen that the probability is essentially the same independent of the bin. Running time for this problem was approximately four seconds per bin. Indeed one would reason, as in paragraph 3, that, at least approximately, each bin should yield the same probability for firing time, given a $\delta$-$\epsilon$ combination. This should occur if the nonlinearities are not too severe and the distributions due to change in temperature and cold cathode diode firing voltage variations are fairly compact. This would then mean that we need only examine one bin to determine the performance of the timer, and our integration procedure could then represent a substantial time saving over a Monte Carlo simulation.
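The entries of Table 1 can be checked for internal consistency: each $R_C$ should be the bin midpoint, each half-width should be about $\epsilon R_C$ with $\epsilon = .01$, and the six probabilities should agree closely. The following sketch does that arithmetic; the list name `TABLE_1` and the helper names are hypothetical.

```python
# Rows of Table 1: (P_i, r_i, r_{i+1}, R_C)
TABLE_1 = [
    (0.994848, 37.4435, 38.2000, 37.8218),
    (0.995103, 38.2000, 38.9717, 38.5858),
    (0.995221, 38.9717, 39.7590, 39.3653),
    (0.995414, 39.7590, 40.5622, 40.1606),
    (0.995452, 40.5622, 41.3817, 40.9719),
    (0.995490, 41.3817, 42.2176, 41.7997),
]


def bin_epsilon(r_lo, r_hi, r_c):
    """Relative half-width of a bin about its centre."""
    return (r_hi - r_lo) / (2.0 * r_c)


def probability_spread(rows):
    """Difference between the largest and smallest bin probability."""
    probs = [row[0] for row in rows]
    return max(probs) - min(probs)


if __name__ == "__main__":
    for p, r_lo, r_hi, r_c in TABLE_1:
        print(r_c, round(0.5 * (r_lo + r_hi), 4), round(bin_epsilon(r_lo, r_hi, r_c), 5))
    print(probability_spread(TABLE_1))
```

The spread of the tabulated probabilities is below .001, which is the sense in which the result is essentially independent of the bin.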


Going back to (6.32), we can also give an error bound for the part neglected in the computation of $P_i$. Let us illustrate in one case what is happening. For instance, we have neglected

(6.36)  f  J  J  ,  ,J  ,  h  (u.Y.R.u2)dudYdRdu2dT 

Clearly,  (6.36)  is  bounded  above  by 

(6.37)  f/2  P  '  f  4  C  th(R.Z,u2)dRdZdu2dT. 

Jli  J"2,  Jzt  /?  4<  —  OO  .  OO  t  J 


where $Z = (u, Y)$. Noting that $h(u, w, R, w_2, u_2, w_3) = g(u, w, w_3, u_2, w_2)\,p(R)$ and that

$$\int_{T_1}^{T_2}\int_{u_{2i}}^{u_{2,i+1}}\int_{R^4(-\infty,\infty)} g(Z, u_2)\,dZ\,du_2\,dT = 1,$$

we need only study the behavior of the integration with respect to $R$. Going back to (6.37), when $s_{i+1} < R < \infty$, we know that $v_1 < v_2 < -3$. Therefore, it is easy to show that

(6.38)  $$p(R_P(T)) < \frac{e^{-v_2^2/2}}{\sqrt{2\pi}\,R_C\,\sigma(C)\,|T - 75|/100}.$$

It follows [2, pg. 149] that

(6.39)  $$\int_{s_{i+1}}^{\infty} p(R_P(T))\,dR_P(T) < \frac{1}{\sqrt{2\pi}}\int_3^{\infty} e^{-x^2/2}\,dx \approx .00135.$$

A similar result is obtained when $R$ is restricted to the interval $(-\infty, s_i(\epsilon))$ and $T > 75$° or when $R$ lies in either $(s_{i+1}'(\epsilon), \infty)$ or $(-\infty, s_i'(\epsilon))$ and $T < 75$°. The result is finally that the portion neglected is bounded above by .0027, so that we are at most off in the third decimal place.
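The constant in (6.39) is the upper tail of the standard normal distribution beyond three standard deviations, and the overall bound is twice that. A short check using the complementary error function is sketched below; the function name `normal_upper_tail` is hypothetical.

```python
import math


def normal_upper_tail(x):
    """P(Z > x) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


if __name__ == "__main__":
    one_side = normal_upper_tail(3.0)
    print(one_side)        # tail mass neglected on one side of a bin
    print(2.0 * one_side)  # both sides together
```

Both values agree with the figures .00135 and .0027 quoted in the text to the precision shown there.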

7.  THE  CASE  OF  TWO  OR  MORE  TIMERS 

An  interesting  case  study  arises  when  there  are  two  or  more  timers  which  are  statistically 
dependent.  This  occurs,  for  example,  when,  after  the  first  timer  is  operated,  a  switch  closes 
and  a  second  timer  is  started,  the  second  one  being  fed  by  the  same  capacitor  which  fed  the 
first  timer.  Let  us  suppose,  for  instance,  that  capacitor  Cl  in  Figure  1  feeds  the  second  fuze 
timer  indicated  in  Figure  6. 

At  the  end  of  operation  of  the  first  timer,  switch  S  in  Figure  6  is  thrown  into  the  position  indi¬ 
cated,  thus  allowing  Cl  to  begin  charging  up  C4.  C5  serves  as  the  reference  capacitor.  The 
second  timer  is  also  governed  by  a  simple  first  order  differential  equation,  and  one  can  show 
that  the  time  is  given  by 

(7.1)  $$t^{(1)} = \frac{R_P^{(1)} C_1 C_4}{C_1 + C_4}\,\ln\frac{C_1 V - C_2(V_T - V)}{C_1 V - C_2(V_T - V) - (V_{T,1} - V)(C_1 + C_4)}.$$

Letting $t_N^{(1)}$ be the nominal time for the second timer, we find the nominal resistance for this timer to be

(7.2)  $$R_N^{(1)} = \frac{(C_1 + C_4)\,t_N^{(1)}}{C_1 C_4}\bigg/\ln\frac{V C_1 - (V_T^{(1)} - V)C_2}{V C_1 - (V_T^{(1)} - V)C_2 - (V_{T,1}^{(1)} - V)(C_1 + C_4)},$$

where $V_T^{(1)} = V_T + \Delta V_T^{(1)}$ and $V_{T,1}^{(1)} = V_{T,1} + \Delta V_{T,1}^{(1)}$. Then, substituting (4.8) into (7.2), we derive the functional relationship


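(7.3)  $R_N^{(1)} = R_N^{(1)}(C_1, C_2, C_4, V, R_N, \Delta V_T^{(1)}, V_{T,1}^{(1)}).$

Solving (7.3) for $V_{T,1}$, we have

(7.4)  $V_{T,1} = f_T(C_1, C_2, C_4, V, R_N, R_N^{(1)}, \Delta V_T^{(1)}, \Delta V_{T,1}^{(1)}).$

To determine the joint density for the process, we must, by analogy with the method in paragraph 4, introduce a pair of diode firing variations $\Delta V_T^{(2)}$ and $\Delta V_{T,1}^{(2)}$. We then consider the following transformation of variables.

(7.5)  $V_{T,1} = f_T(C_1, C_2, C_4, V, R_N, R_N^{(1)}, \Delta V_T^{(1)}, \Delta V_{T,1}^{(1)})$

$C_1 = C_1$

$C_2 = C_2$

$C_4 = C_4$

$V = V$

$R_N = R_N$

$\Delta V_T^{(1)} = \Delta V_T^{(1)}$

$\Delta V_{T,1}^{(1)} = \Delta V_{T,1}^{(1)}$

$\Delta V_T^{(2)} = \Delta V_T^{(2)}$

$\Delta V_{T,1}^{(2)} = \Delta V_{T,1}^{(2)}.$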

To  compute  the  density,  we  employ  (4.10)  and  the  Jacobian  of  the  transformation  (7.5)  to 
obtain 




(7.6)  </5(C,.  C2.  C4.  V.  Ry.  R"\  A  Ki'».  A  F^1,'.  A  F'P.  A  ^ ) 

-  /(C„  C2,  V.  A  A  F'P,  Ry)  •  </,(C4)  •  <f2(A  ^.V )  '  <*j(A  ^ ) 

\9Vh 


d,(VfA) 


dRl" 


Also, if both $R_N^{(1)}$ and $R_N$ are linearized about nominal values of capacitance, tube firing voltages, and regulator voltage, then the map

(7.7)  $R_N^{(1)} = L_1(C_1, C_2, C_4, V_T, \Delta V_T^{(1)}, V, V_{T,1}, \Delta V_{T,1}^{(1)})$

$R_N = L_2(C_1, C_2, V, V_T, \Delta V_T^{(1)})$

$\Delta V_{T,1}^{(1)} = \Delta V_{T,1}^{(1)}$

$\Delta V_T^{(1)} = \Delta V_T^{(1)}$

shows that $(R_N^{(1)}, R_N, \Delta V_{T,1}^{(1)}, \Delta V_T^{(1)})$ is a quadrivariate normal random vector [2, pg. 162]. The reason is that all random variables on the right side of (7.7) are independent and normally distributed. At the nominal temperature, the density function is therefore generally representable by

(7.8)  $d_6(C_1, C_2, C_4, V, R_N, R_N^{(1)}, \Delta V_T^{(1)}, \Delta V_{T,1}^{(1)}, R_P, R_P^{(1)}, \Delta V_T^{(2)}, \Delta V_{T,1}^{(2)})$

$= d_5(C_1, C_2, C_4, V, R_N, R_N^{(1)}, \Delta V_T^{(1)}, \Delta V_{T,1}^{(1)}, \Delta V_T^{(2)}, \Delta V_{T,1}^{(2)})\; p(R_P)\; p^{(1)}(R_P^{(1)}),$

where, for example,

$p(R_P) = 1/(r_{i+1} - r_i)$

and

$p^{(1)}(R_P^{(1)}) = 1/(r_{j+1}^{(1)} - r_j^{(1)})$

if picked resistance is equally likely across the bins. (7.8), also, obviously indicates that picked resistances are statistically independent of the other component values. It will be possible to reduce (7.8) to the simpler form

(7.9)  $d_6(R_N, R_N^{(1)}, \Delta V_T^{(1)}, \Delta V_{T,1}^{(1)}, R_P, R_P^{(1)}, \Delta V_T^{(2)}, \Delta V_{T,1}^{(2)})$

$= p(R_N, R_N^{(1)}, \Delta V_T^{(1)}, \Delta V_{T,1}^{(1)}) \cdot p(R_P) \cdot p^{(1)}(R_P^{(1)}) \cdot p(\Delta V_T^{(2)}) \cdot p(\Delta V_{T,1}^{(2)})$

when (7.7) is valid, $p(R_N, R_N^{(1)}, \Delta V_T^{(1)}, \Delta V_{T,1}^{(1)})$ being the density for the quadrivariate normal distribution [9, pg. 88]. From (7.7) the elements of the covariance matrix [9, pg. 88] can be easily obtained.


Next  account  must  be  taken  of  changes  in  component  values  due  to  temperature  changes 
from  the  nominal  value.  We  use  the  same  ideas  presented  in  paragraph  4,  together  with  the 
same  notation.  The  density  becomes 

(7.10)  d(Cu  C2.  C4,  V,  Ry.  Rtf'.  A  Vl".  A  Vg\.  R„(T).  R"'{T).  A  F^2\  A  Vtf. 

A CP,iT).  A CP2(T).  ACP4(D) 

=  </5(C„  C2.  C4,  V.  Ry.  Rl",  A  F^",  A  Vl",  A  F'^2>,  A  F^2,’ ) 

•  /?(AC/’|(D)  •  p(\CP2(T))  ■  d^CP^T))  ■  p{Rr(T))  ■  p"'{R"'(T)). 


where $p(R_P(T))$ and $p^{(1)}(R_P^{(1)}(T))$ are again convolution densities. We must now determine limits of integration. One requires that the first timer fire in time $t$, where $t_1 = t_N(1 - \delta) < t < t_N(1 + \delta) = t_2$ and that the second timer fire in time $t^{(1)}$, where $t_1^{(1)} = t_N^{(1)}(1 - \delta^{(1)}) \le t^{(1)} \le t_N^{(1)}(1 + \delta^{(1)}) = t_2^{(1)}$. Therefore, we have

(7U) 

t\u / Fil)  <  RP{'HT)  ^  t^/F'" 

—  °°  <  C,  <  i  =  1,2,4 

—  oo  <  ACP,(D  <  «>,  /  =  1,2,4 

—  °°  <  A  <  OO 

—  OO  <  A  v£\  <  OO 

—  oo  <  A  <  oo 

—  oo  <  a  <  oo 

ri  ^  Rs  <  r,+\ 

r(l)  <  D  (1)  ^  ,(l> 

ri  ^  Ks  <  0+i 

—  oo  <  |/  <  oo , 

where  F  =  F(C,<n,  C,(r),  K  F^»),  Fm  =  F">(C,(r),  C2(D,  C4(D,  K  V?\  Vfl)  and 
C/2'  =  +  A  Vf2\  f/  i  =  f'f  i  +  A  as  before,  Ff  is  given  by  (4.8)  and  Ff  |  by  (7.5). 

Also,  (4.13)  holds  for  /  =  1,2,  and  4. 

An integration scheme patterned after (4.14) can then be recorded with $p_{ij} = \mathrm{prob}\,(R_N \in \text{bin } i \text{ and } R_N^{(1)} \in \text{bin } j)$ in place of $p_i$. (4.15) would then be replaced by a double sum:

(7.12)  $$P(t_1 < t < t_2,\ t_1^{(1)} < t^{(1)} < t_2^{(1)}) = \frac{1}{T_2 - T_1}\sum_i\sum_j p_{ij}\int_{T_1}^{T_2} P_{ij}(t_1 < t < t_2,\ t_1^{(1)} < t^{(1)} < t_2^{(1)}\,|\,T)\,dT.$$

Also, in the case where (7.7) is valid, $C_1$, $C_2$, $C_4$, and $V$ are eliminated and $\Delta CP_i(T)$ is to be replaced by $\Delta C_i(T)$. In addition, $t^{(1)}/F^{(1)}$ and $t/F$ become linear forms in $\Delta C_1(T)$, $\Delta C_2(T)$, $\Delta C_4(T)$, $R_N$, $R_N^{(1)}$, $\Delta V_T^{(1)}$, $\Delta V_T^{(2)}$, $\Delta V_{T,1}^{(1)}$, and $\Delta V_{T,1}^{(2)}$. In that case a sixteen-fold integral is reduced to a twelve-fold integral.
THE  ASYMPTOTIC  SUFFICIENCY  OF  SPARSE
ORDER  STATISTICS  IN  TESTS  OF  FIT 
WITH  NUISANCE  PARAMETERS* 
Ithaca, New York


ABSTRACT 

In an earlier paper, it was shown that for the problem of testing that a sample comes from a completely specified distribution, a relatively small number of order statistics is asymptotically sufficient, and for all asymptotic probability calculations the joint distribution of these order statistics can be assumed to be normal. In the present paper, these results are extended to certain cases where the problem is to test the hypothesis that a sample comes from a distribution which is a member of a specified parametric family of distributions, with the parameters unspecified.


1. INTRODUCTION

For each $n$, the random variables $X_1(n), \ldots, X_n(n)$ are independent, identically distributed, with unknown common probability density function and cumulative distribution function $f_n(x)$, $F_n(x)$ respectively. An $m$-parameter family of distributions, with pdf $f_0(x;\theta_1, \ldots, \theta_m)$ and cdf $F_0(x;\theta_1, \ldots, \theta_m)$, is specified, and the problem is to test the hypothesis that $f_n(x) = f_0(x;\theta_1, \ldots, \theta_m)$ for all $x$, for some unspecified values of $\theta_1, \ldots, \theta_m$.

In [5] the simpler problem of testing the hypothesis that $f_n(x) = f_0(x)$, where $f_0(x)$ is completely specified, was discussed. In this simpler case, the familiar probability integral transformation can be used to reduce the problem to that of testing whether a sample comes from a uniform distribution over (0,1). This type of reduction is not always available when the hypothetical density is not completely specified. (See [1] for some cases where the reduction is available.)
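The probability integral transformation mentioned above can be illustrated numerically: applying the hypothesized cdf to a sample gives values that are uniform on (0,1) when the hypothesis holds, and visibly non-uniform when a parameter is wrong. The sketch below uses an exponential distribution purely as an assumed example; the function names are hypothetical.

```python
import math
import random


def exp_cdf(x, rate):
    """Cumulative distribution function of an exponential distribution."""
    return 1.0 - math.exp(-rate * x)


def transform(sample, cdf):
    """Probability integral transformation: apply the hypothesized cdf pointwise."""
    return [cdf(x) for x in sample]


def distance_from_uniform(values):
    """Largest gap between the empirical cdf of the values and the uniform cdf."""
    ordered = sorted(values)
    n = len(ordered)
    return max(max((k + 1) / n - u, u - k / n) for k, u in enumerate(ordered))


if __name__ == "__main__":
    rng = random.Random(0)
    sample = [rng.expovariate(2.0) for _ in range(5000)]
    right = transform(sample, lambda x: exp_cdf(x, 2.0))
    wrong = transform(sample, lambda x: exp_cdf(x, 1.0))
    print(distance_from_uniform(right), distance_from_uniform(wrong))
```

With unknown location and scale parameters the correct cdf is not available to apply, which is why the reduction fails in the composite case treated in this paper.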


Since we will be interested in large sample theory, to keep the alternatives challenging we will assume that $f_n(x) = f_0(x;\theta_1^0, \ldots, \theta_m^0)(1 + r_n(x))$ for some unknown values $\theta_1^0, \ldots, \theta_m^0$ and some unknown function $r_n(x)$ satisfying the conditions $\sup_x |r_n(x)| < n^{-\epsilon}$ and $\sup_x \left|\frac{d^j r_n(x)}{dx^j}\right| < n^{-\epsilon}$ for all $n$ and for $j = 1,2,3,4$, where $\epsilon$ is a fixed value in the open interval $\left(\frac{1}{3}, \frac{1}{2}\right)$.
The case where $m = 2$, and $\theta_1$, $\theta_2$ are location and scale parameters respectively is relatively simple to analyze, and occurs often in practice, so until Section 5 we will discuss only this case. That is, $f_0(x;\theta_1,\theta_2) = \frac{1}{\theta_2}\,g\left(\frac{x - \theta_1}{\theta_2}\right)$ with $\theta_2 > 0$, and the pdf $g(x)$ is completely specified. $G(x)$ denotes $\int_{-\infty}^{x} g(t)\,dt$. We assume that $\sup_x \left|\frac{d^j}{dx^j}g(x)\right| < \Delta_1 < \infty$ for $j = 1,2,3,4$, and that $\sup_x g(x) < \Delta_2 < \infty$.

X 

For  each  n ,  we  choose  positive  quantities  p„,  q„,  and  L„  satisfying  the  following  condi- 


Pn  <  <?„  <  1  -  n  €. 

.  ,  „  _  n(q„-p„) 

np„.  nq„.  L„,  and  K„  =  - — - are  all  integers. 


lim  — —  =  1  for  some  fixed  8  in  the  open  interval  0,  —  —  —  . 

n^oo  L„  2  6 


lim  p„  =  0,  lim  qn  =  1,  lim  np„  =  00 . 

H—oo  n  — »oo  a  —00 


».«w|,(x):C-'|7^T|s*SO-'|Trj: 

positive  -y  with  y  -  €  +  28  +  5y  <  0. 


>  n  y  for  a  fixed 


,.  n2(  ,. 

lim  - —  oo,  lim  — jz - r 

n  — »oo  <i—oo  w  (1  —  qn) 


g(G~'(p„)) 
g  U) 

g(G~'(q„)) 

g(x) 


>  Aj  >  0  for  all  x  <  G  '(p„),  and 


>  A4  >  0  for  all  x  >  G  '( qn ). 


K,(n)  <  Y2(n)  <  <  Y„(n)  denote  the  ordered  values  of  A"i(w),  ....  X„(n).  For 

typographical  simplicity,  we  denote  T,(n)  by  Y,.  For  j  —  1 . K„,  let  Y,(n)  denote  — 

"f”  ^nfi„  an^  D,(n)  denote  (  Y„n^ +  ^  *  ( i  -  u  r„)  ■  For  J  “ 

1,  ...  ,  W„  -  1,  let  Iftlj.n) . #■"(£„-  1J.«)  denote  the  values  of  the  L„  -  1  variables 

among  [X^(n) . *„(«))  which  fall  in  the  open  interval  |?,(n)  -  ~y  T/(n)  + 

1,  written  in  random  order:  that  is,  the  same  order  in  which  the  corresponding  elements 

_ 

I V(i.j.n)  -  Y,(h) 

of  (T,(/i) . X„(n )!  are  written.  Define  W{iJ,n)  as  - n  ,  , - 


for $i = 1, \ldots, L_n - 1$ and $j = 1, \ldots, K_n$, so $-\frac{1}{2} < W(i,j,n) < \frac{1}{2}$. Let $W(j,n)$ denote the $(L_n - 1)$-dimensional vector $\{W(1,j,n), \ldots, W(L_n - 1,j,n)\}$ for $j = 1, \ldots, K_n$. Let

^(1.0,«) .  W(np„  -  1,0, n)  denote  the  values  of  the  np„  -  1  variables  among 

(Af|(«),  ....  X„in)}  which  fall  in  the  open  interval  (-«>,  Y„p)  written  in  random  order.  Let 

denote  the  vector  {K^d.O.n) . Winp„  -  1,0,//)).  Let  +  l.n) . 

W(n  -  nq„.K„  +  l.n)  denote  the  values  of  the  n  -  nq„  variables  among  (A'|(n) . A-,, ( /i ) } 

which  fall  in  the  open  interval  iY„Q  , °°),  written  in  random  order.  Let  WiK„  +  l,n)  denote 

the vector $\{W(1,K_n + 1,n), \ldots, W(n - nq_n,K_n + 1,n)\}$. Let $T(n)$ denote the $(K_n + 1)$-dimensional vector $\{Y_{np_n + jL_n};\ j = 0, 1, \ldots, K_n\}$. Note that if we are given the $K_n + 3$ vectors defined, we can compute the $n$ order statistics $Y_1, \ldots, Y_n$, so that any test procedure based on the order statistics can also be based on the $K_n + 3$ vectors.

Let  h„ijin))  denote  the  joint  pdf  for  the  elements  of  the  vector  Tin),  and  let 
( h \ii,n)  Q(n))  denote  the  joint  conditional  pdf  for  the  elements  of  the  vector  fV(i.n)  given 

~~  K+2 

that  Tin)  =jin).  Then  the  joint  pdf  for  all  n  elements  of  all  the  vectors  is  h„ijin))  fj 
h'„ (_w (i,n)  [*(/»)),  which  we  denote  by 

Next we construct two different "artificial" joint pdfs for the $n$ elements of the vectors.

In the first artificial joint pdf, the marginal pdf for $T(n)$ and the conditional pdfs for $W(0,n)$ and $W(K_n + 1,n)$ are the same as above. The pdfs for the elements of the other vectors are constructed as follows.


Let $a_j(n)$ denote $G^{-1}\left(\frac{np_n + (j - \frac{1}{2})L_n}{n}\right)$, and $\gamma_j(n)$ denote $\frac{L_n}{2n}\,\frac{g'(a_j(n))}{g^2(a_j(n))}$, for $j = 1, \ldots, K_n$. Let $U(i,j)$ ($i = 1, \ldots, L_n - 1$; $j = 1, \ldots, K_n$) be IID random variables, independent of $T(n)$, $W(0,n)$, $W(K_n + 1,n)$, and each with a uniform distribution over (0,1). Then the distribution of $W(i,j,n)$ is to be the distribution of $-\frac{1}{2} + (1 + \gamma_j(n))U(i,j) - \gamma_j(n)U^2(i,j)$, for $i = 1, \ldots, L_n - 1$ and $j = 1, \ldots, K_n$. Denote the resulting joint pdf for all $n$ elements by $h_n^{(2)}$.
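The quadratic transformation of a uniform variable used in this construction has a closed-form density, which may help in reading the logarithmic expressions that appear in the proof of Theorem 1. The sketch below writes `gamma` for the coefficient multiplying $U$ and $U^2$ and assumes $|\gamma| < 1$, so that the map is increasing on (0,1); the function names are hypothetical.

```python
import math


def w_from_u(u, gamma):
    """Map a uniform value u in [0, 1] to w = -1/2 + (1 + gamma) u - gamma u^2."""
    return -0.5 + (1.0 + gamma) * u - gamma * u * u


def w_cdf(w, gamma):
    """Distribution function of w, i.e. the inverse of w_from_u on [-1/2, 1/2]."""
    if gamma == 0.0:
        return w + 0.5
    root = math.sqrt(1.0 + gamma * gamma - 4.0 * gamma * w)
    return ((1.0 + gamma) - root) / (2.0 * gamma)


def w_density(w, gamma):
    """Density of w on (-1/2, 1/2): (1 + gamma^2 - 4 gamma w)^(-1/2)."""
    return 1.0 / math.sqrt(1.0 + gamma * gamma - 4.0 * gamma * w)


if __name__ == "__main__":
    gamma = 0.2
    w = w_from_u(0.3, gamma)
    print(w, w_cdf(w, gamma), w_density(w, gamma))
```

To first order in `gamma` the density is $1 + 2\gamma w$, a linear tilt of the uniform density on $(-\frac{1}{2}, \frac{1}{2})$, and the logarithm of the density is $-\frac{1}{2}\log(1 + \gamma^2 - 4\gamma w)$.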


In  the  second  artificial  joint  distribution,  the  marginal  pdf  for  Tin)  and  the  conditional 

pdfs  for  Wil.n) . ( AT„ , n )  given  Tin)  are  the  same  as  in  h„a>.  Given  Tin),  the  np„  - 

1  elements  of  M^(0.  n )  are  distributed  as  IID  random  variables,  each  with  pdf 
giix-e?)/0$)/0$G  (iYnPn-8?)/0$)  for  x  <  Kfl/V  zero  if  *  >  Y^.  Given  T  in),  the  n-  nq„ 
elements  of  WiKn  +  l.n)  are  distributed  as  IID  random  variables,  each  with  pdf 
((l/02°)  g  iix  -  9?)/9$)/i  1  -  Gii  Ymn  -  0,°)/020))  for  x  >  Y„%,  zero  if  *  <  Y„%.  Denote  the 
resulting  joint  pdf  for  all  n  elements  by  hn<}>. 

If $S_n$ is any measurable region in $n$-dimensional space, let $P_{h_n^{(i)}}(S_n)$ denote the probability assigned to $S_n$ by the pdf $h_n^{(i)}$. The next two sections are devoted to proving the following:

THEOREM  1:  lim  sup  | P,  (:tiS„)  -  P,  (S„)|  =  0. 

tl—oo  Sn  "  •>  "n 

THEOREM  2:  lim  sup  \P,  -  P,  m(S„)|  -  0. 

n  — 'w  V  ’a  f,n 
2.  PROOF  OF  THEOREM  1


Let  /i„(4)  denote  the  joint  pdf  which  differs  from  /i„<2)  only  in  that  yj(n)  is  replaced  by 


defined  as  - - -  ,  where  5,(n) 

2«  /2(a,(»)) 


*Pn  + 

1 

J  2 

L„ 

»  1 

It  was  shown  in 


(8)  that  lim  sup  |P .  (4i (S„)  —  P*  (»($,)!  •  0,  and  thus  Theorem  1  will  be  proved  if  we  can  show 

n— oo  Sn  nn  nn 

that  lim  sup  \Phw  ( Sn )  —  P^fS,,))  —  0.  By  the  reasoning  used  in  [8],  this  last  equality  will 


n— »  S, 

be  demonstrated  if  we  can  show  that 


,  h„m(T(n),  W(0,n),  ....  £(K„  +  1.«)) 

h}*HT(n),  W(Q,n) . \£(K„  +  \,n)) 

say,  converges  stochastically  to  zero  as  n  increases,  when  the  joint  pdf  is  actually  From  . 
the  definitions  above,  and  the  formula  in  [8],  for  all  sufficiently  large  n  we  can  write  R„  as 

log[l  +  y J(n)  —4y j(n)  W(i,j,n)) 

-  logll  +y}(n)  -4y,(n)  W(i,j,n)] 


(2.1) 


•  L-\ 

inn 

~I  I 

1  7-1  f-l 


where  W(iJ.n)  have  the  same  distribution  as  -  y  +  (1  +y  ,(n))U(iJ)  -  y  ,(n)  U2(ij)-  We 

show  that  the  expression  (2.1)  converges  stochastically  to  zero  as  n  increases  by  means  of  three 
lemmas.  (The  order  symbol  0( )  used  below  has  the  usual  interpretation.) 

LEMMA  2.1:  max  I y,(fl)l  0  (n  ^  *). 

'S'S*,, 


PROOF:  Directly  from  the  assumptions  and  the  definition  of  y,(rt). 

LEMMA 2.2:  $\sup_{p_n \le t \le q_n} |F_n^{-1}(t) - \{\theta_1^0 + \theta_2^0 G^{-1}(t)\}| = O(n^{-\epsilon+\gamma}).$

I  ^ _ ^0  | 

PROOF:  Since  f„(x )  —  — 3 -  g  — -3-M0  +  r„(x)),  with  (jc) I  <  rT‘,  we  have  Fn(x)  — 

I  —  0  to  _  _  x  )  i  t  —  0$  _ 

G| — ^0 —  +  where  Rn(x)  -  J ^  ~  g|-  ^  rn(t)dt ,  and  thus  |/?„(x)|  <  n~*G 

F„(x)  =  G  | — 0Q--j  (l  +  fl„(x)),  where  |/?„(x)|  <  n~‘  for  all 
. 2 .  _  ,  ,  fx-0,°) 


x-e° 


.  Then  we  can  write 


0  I  ■  >11511  wc  kmi  —  v/ - o  vit  a„w77,  wnere 

)  l  ) 

:.  Fix  any  value  t  in  the  closed  interval  t/j„,<7„).  Writing  F„(x)*=  t  “C 

f  _  i  /  x  * 


we  have  x  -  F„  '(?)  and  G 
(2.2)  G~' 

We  can  write  G  1  ^  - 


0? 


(1  +  *»), 


1  +  RJF-Ht)) 


1  +  n~‘ 
? 


0? 


s  <  0- 


,  SO 


I 


1  +  n" 


0? 


G'(f)  - 


in 


1  - 

1 


1  +  «"* 


where  ?*  is  in  the  open 
interval


1  + 


.  _ L 


l  +  «' 


that  sup 

P„  <  r  <  </„ 


\  1  -  n  ‘ 
using  the  inequalities  (2.2). 


,  and  thus 


1 


g(G  '0*» 


<  ny,  by  assumption  (1.5).  Then  sup 


-  G  1  (/) |  —  0(n  *+y).  By  a  completely  analogous  argument,  it  can  be  shown 
—  <7 * 1  (r)  |  —  0(h  ‘+v).  Then  the  lemma  follows  immediately. 


i 


LEMMA  2.3:  y,(w)  =  y  ,(n)  +  8,(«),  where  max  |S,(//)|  -  0(«  3 
PROOF:  By  lemma  2.2,  we  can  write  y  ,(n)  as 

,/;;(tfP+tf20“>(«) -*•»,(»)) 

In  +  {n)  +  «(„))  ’ 


1  V  ~  ^  ■ 

where  max  |8  (w)|  =  0(/T‘+y),  f„(x)  ^  —z  g - t— -  r'n  ( x )  +  (1  +  r„(x)) 

I  <  /  <  A'„  0)  02U 


X  -  0P 


0? 


(0j)2 


,  so  we  can  write  _/„(0|°  +  0?  <*/(»)  4-  8,(n))  as  -  ft  v  #'(«,(«))  +  8  *(«),  where 

W2 ) 

max  |  8  *( n )  |  =  0(n  <yy).  We  can  also  write  /„(0|°  +  0?a,(n)  +  8,(n))  as  -^r  g(a,(n ))  + 

'<'<  a„  0j 

8,  (n),  and  thus  /2(0|’  +  0?a,  (n)  +  8  (w))  as  — r  g2(a,(n ))  +  8 ,’(«),  where  max 

(02)2  i</<k„ 

|8,(n)|  —  0(n  ,yy)  and  max  |8,‘(n)|  «  0 (n~‘+y).  Thus  we  can  write  y,(n)  as  — 1 

i</<x„  2n 

|((l/(02)2)  g‘(a,(n))  +8  *(n))/((l/(02)J)  g2(a,(n))  +6 ,*(«))),  and  the  proof  of  the  lemma  fol¬ 
lows  directly  from  assumptions  (1.3)  and  (1.5). 

Now we complete the proof of Theorem 1 by applying the expansion $\log(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4(1 + \omega x)^4}$ for $|x| < 1$, where $|\omega| < 1$, to each of the logarithms in the expression (2.1). This enables us to write the expression (2.1) as the sum of a finite number of expressions, each of which can easily be shown to converge stochastically to zero as $n$ increases, using the lemmas. For example, two of these expressions are:


(2.3) 


(2.4) 


K„  /.„  I 

t£  [  ( y,2(")-y,2(" )).  and 

1  /-i  /-i 


21  I  (y ,■(«)  -  y j(n))W(i,j,n). 

i-\  /-i 


The  expression  (2.3)  is  the  sum  of  K„(L„  -  I)  terms,  where  K„(L„  -  1)  <  n.  A  typical  term 
can  be  written  as  (y ,-(«) - y,(n))  (y,(n)  +  y ,(n)),  which  by  Lemmas  2.1  and  2.3  is 

-  -#  +  2fi  +  5y  +  + 

0(n  ).  So  the  whole  expression  (2.3)  is  0(n3  )  and  converges  to  zero  as  n 

a,  <-n-\ 

increases,  by  assumption  (1.5).  The  expected  value  of  the  expression  (2.4)  is  2  £  £ 

/-i  i- 1 


(y,(»)  -y ,(«)) 


—  y,(n),  and  the  variance  of  the  expression  (2.4)  is  4  £  £ 
(>/(w)-y,(n))2  -pr  +  -  ' 1 .  This  mean  and  variance  can  both  be  seen  to  converge  to  zero

112  1 oU  I 

as  n  increases  by  the  same  reasoning  as  in  the  analysis  of  the  expression  (2.3),  and  thus  the 
expression  (2.4)  converges  stochastically  to  zero  as  n  increases.  The  other  expressions  in  the 
sum  comprising  the  expression  (2.1)  can  be  handled  similarly,  completing  the  proof  of 
Theorem  1. 


3.  PROOF  OF  THEOREM  2 

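In Section 2 we showed that we can write $F_n(x) = G\left(\frac{x - \theta_1^0}{\theta_2^0}\right)(1 + R_n(x))$ where $|R_n(x)| < n^{-\epsilon}$ for all $x$. We now develop an analogous expression for $1 - F_n(x)$.

$$1 - F_n(x) = \int_x^{\infty} f_n(t)\,dt = 1 - G\left(\frac{x - \theta_1^0}{\theta_2^0}\right) + \int_x^{\infty} r_n(t)\,\frac{1}{\theta_2^0}\,g\left(\frac{t - \theta_1^0}{\theta_2^0}\right)dt,$$

and since

$$\left|\int_x^{\infty} r_n(t)\,\frac{1}{\theta_2^0}\,g\left(\frac{t - \theta_1^0}{\theta_2^0}\right)dt\right| < n^{-\epsilon}\left[1 - G\left(\frac{x - \theta_1^0}{\theta_2^0}\right)\right],$$

we can write $1 - F_n(x) = \left[1 - G\left(\frac{x - \theta_1^0}{\theta_2^0}\right)\right](1 + S_n(x))$ where $|S_n(x)| < n^{-\epsilon}$ for all $x$.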


Theorem 2 will be proved if we can show that

$$R_n^* = \log\frac{h_n^{(2)}(T(n), W(0,n), W(K_n+1,n))}{h_n^{(3)}(T(n), W(0,n), W(K_n+1,n))},$$

say, converges stochastically to zero as $n$ increases, when the joint pdf is actually $h_n^{(2)}$. Assuming
$h_n^{(2)}$ is the joint pdf, the conditional (given $T(n)$) distribution of $R_n^*$ is the same as the distribution
of $Q_n(1) + Q_n(2)$, where

$$Q_n(1) = \sum_{i=1}^{np_n-1} \log(1 + r_n(V_i)) - (np_n - 1)\log(1 + R_n(Y_{np_n})),$$

$$Q_n(2) = \sum_{i=1}^{n-nq_n} \log(1 + r_n(Z_i)) - (n - nq_n)\log(1 + S_n(Y_{nq_n})),$$

and $V_1, \ldots, V_{np_n-1}, Z_1, \ldots, Z_{n-nq_n}$ are mutually independent, each $V_i$ with pdf $f_n(v)/F_n(Y_{np_n})$ for
$v < Y_{np_n}$, zero for $v > Y_{np_n}$, each $Z_i$ with pdf $f_n(z)/[1 - F_n(Y_{nq_n})]$ for $z > Y_{nq_n}$, zero for $z < Y_{nq_n}$.

LEMMA 3.1: $Q_n(1)$ converges stochastically to zero as $n$ increases.


PROOF: Define $\bar{Q}_n(1)$ as $\sum_{i=1}^{np_n-1} r_n(V_i) - (np_n - 1)R_n(Y_{np_n})$. By assumption (1.6),
$|Q_n(1) - \bar{Q}_n(1)|$ converges stochastically to zero as $n$ increases. Thus the lemma will be proved
if we show that $\bar{Q}_n(1)$ converges stochastically to zero as $n$ increases.


ElrJVMTin))  - 


n  jo  *  (-^p)  ° + r"(,))^ 


(1  +  R„(K„„)) 


-m-- 


fc*  • 


--  ---  • 
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G0 

Ynp-e? 

R„iY„„)  +  Z„n~2‘G 

#2° 

02° 

G0 

Y»p-°? 

#2° 

(1  +  RJY„PJ) 

where $|\omega_n| < 1$. From this, it follows that $|E\{r_n(V_1) \mid T(n)\} - R_n(Y_{np_n})| = O_p(n^{-2\epsilon})$. This
implies that $E\{\bar{Q}_n(1) \mid T(n)\}$ converges to zero as $n$ increases, and also that Variance
$\{r_n(V_1) \mid T(n)\} = O_p(n^{-2\epsilon})$, which in turn implies that Variance $\{\bar{Q}_n(1) \mid T(n)\}$ converges
stochastically to zero as $n$ increases. These facts clearly imply that $\bar{Q}_n(1)$ converges stochastically
to zero as $n$ increases.


LEMMA 3.2: $Q_n(2)$ converges stochastically to zero as $n$ increases.


PROOF: Define $\bar{Q}_n(2)$ as $\sum_{i=1}^{n-nq_n} r_n(Z_i) - (n - nq_n)S_n(Y_{nq_n})$. Just as in Lemma 3.1, all we
have to do is to prove that $\bar{Q}_n(2)$ converges stochastically to zero as $n$ increases.
E\rSZ,)\T(n)\  = 


y 


L 

>?  *  o? 


(1  +  r„(t))dt 


1  -G 


Ym, ,-«? 


[1+5„(T„V  )] 


S  (Y  ) 

•^n  '  *  na  ' 


l -g 


-oj> 


+  to  „n 


1-G 


»  2° 


1-G 


Y„«,r9? 


«2° 


[i  +  s„( 


where $|\omega_n| < 1$. From this, it follows that $|E\{r_n(Z_1) \mid T(n)\} - S_n(Y_{nq_n})| = O_p(n^{-2\epsilon})$. The rest of
the proof is similar to the proof of Lemma 3.1.

Lemmas 3.1 and 3.2 imply that $R_n^*$ converges stochastically to zero as $n$ increases, and this
proves Theorem 2.

4. CONSEQUENCES OF THE THEOREMS

Theorem 1 implies that a statistician who knows only the vectors $T(n)$, $W(0,n)$,
$W(K_n+1,n)$ is asymptotically as well off as a statistician who knows all the vectors $T(n)$,
$W(0,n)$, $W(1,n)$, ..., $W(K_n+1,n)$. This is so because given $T(n)$, using a table of random
numbers it is possible to generate additional random variables so the joint distribution of the
additional random variables and the elements of $T(n)$, $W(0,n)$, $W(K_n+1,n)$ is the joint
distribution given by $h_n^{(2)}$. But Theorem 1 states that all probabilities computed using $h_n^{(2)}$ are
asymptotically the same as probabilities computed under the actual pdf $h_n^{(1)}$.

Theorem 2 implies that asymptotically the order statistics $(Y_1, \ldots, Y_{np_n-1}, Y_{nq_n+1}, \ldots, Y_n)$
contain no information about $r_n(x)$. This is so because under $h_n^{(3)}$ the conditional
distribution (given $T(n)$) of these order statistics does not involve $r_n(x)$.

Taken together, the two theorems imply that a knowledge of $T(n)$ is asymptotically as
good as a knowledge of the whole sample, for the purpose of testing whether $r_n(x) = 0$. This
assumes that we have to deal only with the challenging alternatives described in Section 1, but
less challenging alternatives do not pose any problem asymptotically.

5.  EXTENSION  TO  OTHER  CASES 

The results above were for the case where the unknown parameters are location and scale
parameters. In other cases, it may not be possible to choose $p_n$ and $q_n$ that will guarantee that
assumptions (1.5) and (1.6) hold for all $\theta_1, \ldots, \theta_m$, if we want $\lim_{n\to\infty} p_n = 0$ and $\lim_{n\to\infty} q_n = 1$.
But if we fix $p$ and $q$ with $0 < p < q < 1$, an analogue of Theorem 1 can often be proved with
$p_n$ replaced by $p$, $q_n$ replaced by $q$, and $\alpha_i(n)$, $\gamma_i(n)$


defined  as  F0  1 


1 

np  + 

J  2 

L„ 

In 


fa  («,(»);  0i . if,,,) 

fo  («/(«);  0| . 0,„) 


respectively. 


where $\hat\theta_1, \ldots, \hat\theta_m$ are estimates of $\theta_1^0, \ldots, \theta_m^0$, based on $\{Y_{np}, Y_{np+L_n}, \ldots, Y_{nq}\}$. Then, if we
are willing to ignore departures from the hypothesis in the tails of the distribution, we can still
use only the order statistics $\{Y_{np}, Y_{np+L_n}, \ldots, Y_{nq}\}$.


6.  APPLICATIONS 


For the case where $m = 2$ and $\theta_1$, $\theta_2$ are location and scale parameters respectively, various
tests based on $T(n)$ have been investigated in [2] and [6]. In particular, [2] contains various
analogues of the familiar Wilk-Shapiro test, first proposed in [3]. The tests in [2] and [6]
were based on $T(n)$ because it made the analysis easier. The present paper gives a theoretical
justification for basing tests on these sparse order statistics alone.


For  the  location  and  scale  parameter  case,  we  can  construct  other  tests,  as  follows.  For 


j  -  0,1 . K„,  let  V,in)  denote  Vn  /„ 


»Pn+jLn 


let  Z,(n)  denote  Vn  g 

0j 


G_,  »Pn  + 


-F,-' 


Y„Pn,iL-F;' 


np„  +  jL„ 


nPn+jL„ 


,  and 


It was shown in [4] that for all asymptotic probability calculations, we can assume that the joint
distribution of $\{V_0(n), \ldots, V_{K_n}(n)\}$ is given by the normal pdf
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c„exp 


n(L„-  1) 


2  L- 


i  L«Vjp 

+  T77—  x  +  LCvy-v,-!)* 

MPit  rtvl  0i|/  ,_i 


1  38 

Under  the  additional  condition  that  —  -  — - e  +  2y  <  0,  it  can  be  shown  that  for  all 


asymptotic  probability  calculations  we  can  assume  that  the  joint  distribution  of 


|Z0(/j) . ZK  («)}  is  given  by  the  normal  pdf  just  described.  Then,  if  we  define  p,  as - - 

Q’i 


f~L~ 

V  np„ 


1  + 


,  P  2  as 


and  the  observable  random  variables  Q0, 


Qk„  as 


for  j  —  1 . K„,  a  straightforward  computation  shows  that  for  all  asymptotic  probability  cal¬ 
culations  we  can  assume  that  Q0 ,  (?i . QK  are  independent,  each  with  a  normal  distribu¬ 

tion  with  standard  deviation  0°,  and  with 


£i<?o1  -  W  ' 


E\Q,\ 


Vn(L„-  1) 


-  h„(J-  1)  +  for  y  =  1 . K„, 


where  h„(J)  =*  j? 


0,°  +  02UG 


0/^-1 


:-i 


np„  +  jL„ 


np„+jL„ 


If  the  hypothesis  is  true,  F„ 


-i 


np„  +  jL„ 


n P„  +  .iL„ 


,  and  in  this  case  we  can  write  E[Q,}  —  A„(. /)© P  +  B„(j)0 2,  where 


$A_n(j)$, $B_n(j)$ are known, for $j = 0, \ldots, K_n$. So we have reduced our hypothesis testing problem
to the following: we observe random variables $Q_0, Q_1, \ldots, Q_{K_n}$ which are independent
and normal, each with the same standard deviation $\theta_2^0$, which is unknown. The problem is to
test the hypothesis that $E\{Q_j\} = A_n(j)\theta_1^0 + B_n(j)\theta_2^0$ for some unknown $\theta_1^0$, where $A_n(j)$ and
$B_n(j)$ are known values, for $j = 0, 1, \ldots, K_n$, against alternatives that $E\{Q_j\} = A_n(j)\theta_1^0 +
B_n(j)\theta_2^0 + \Delta_n(j)$, where $\Delta_n(j)$ is unknown.


The formulation of the problem just described makes it easy to construct various tests.
For example, suppose for convenience that $K_n + 1$ is a multiple of 4. Then it is possible to
find $\frac{1}{4}(K_n + 1)$ sets of nonrandom quantities $\{\lambda_n(4i), \lambda_n(4i+1), \lambda_n(4i+2), \lambda_n(4i+3);
i = 0, \ldots, \frac{K_n-3}{4}\}$ such that the $\frac{1}{4}(K_n + 1)$ quantities

$$\bar{Q}_n(i) = \lambda_n(4i)Q_{4i} + \lambda_n(4i+1)Q_{4i+1} + \lambda_n(4i+2)Q_{4i+2} + \lambda_n(4i+3)Q_{4i+3}, \quad i = 0, 1, \ldots, \frac{K_n-3}{4},$$

can be assumed to be independent normal random variables, each with unknown standard
deviation $\theta_2^0$, and with $E\{\bar{Q}_n(i)\} = \sum_{j=0}^{3} \lambda_n(4i+j)\Delta_n(4i+j) = \bar{\Delta}_n(i)$, say, where $\bar{\Delta}_n(i)$ is
unknown. Then the hypothesis to be tested is that $\bar{\Delta}_n(i) = 0$ for all $i$. But if we examine the
development above, we see that $\{\bar{\Delta}_n(i)\}$ is not completely arbitrary. Instead,
$\bar{\Delta}_n(i) = q_n\left(\frac{4i}{K_n-3}\right)$, where $q_n(v)$ is a continuous function of $v$ for $0 \le v \le 1$.
If we have some particular alternative $q_n(v)$ against which to test the
hypothesis, a likelihood ratio test can be constructed. If we want to test against a very wide
class of alternatives, we could apply one of various nonparametric tests. For example, we could
base a test on the total number of runs of positive and negative elements in the sequence
$\{\bar{Q}_n(i)\}$. If the hypothesis is true, there should be a relatively large number of runs, but if the
hypothesis is false, neighboring $\bar{Q}_n(i)$'s would tend to have the same sign, decreasing the total
number of runs. Other tests for an analogous problem are developed in [7].
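
The runs-based test mentioned above can be sketched as follows. This is only an illustration, not the procedure of the paper: it counts runs of positive and negative elements and standardizes the count with the usual Wald-Wolfowitz normal approximation, which is an assumption brought in here. The function names are hypothetical, and exact zeros are simply dropped.

```python
import math

def count_sign_runs(values):
    """Return (number of runs, number of positives, number of negatives).

    Exact zeros are dropped; this is a convenience assumption of the sketch.
    """
    signs = [v > 0 for v in values if v != 0]
    if not signs:
        return 0, 0, 0
    # A new run starts wherever two neighboring elements differ in sign.
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    n_pos = sum(signs)
    return runs, n_pos, len(signs) - n_pos

def runs_z_score(values):
    """Standardized run count under the Wald-Wolfowitz normal approximation.

    Strongly negative values mean fewer runs than expected, which is the
    direction of departure described in the text.
    """
    runs, n_pos, n_neg = count_sign_runs(values)
    n = n_pos + n_neg
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both signs must be present")
    mean = 2.0 * n_pos * n_neg / n + 1.0
    var = 2.0 * n_pos * n_neg * (2.0 * n_pos * n_neg - n) / (n * n * (n - 1.0))
    if var <= 0.0:
        raise ValueError("sample too small for the normal approximation")
    return (runs - mean) / math.sqrt(var)
```

A sequence whose neighbors share signs, such as `[1, 1, 1, -1, -1, -1]`, gives a negative score, while an alternating sequence gives a positive one.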


In  the  case  where  £(.*) 


'/2n 


e  2  ,  all  the  conditions  imposed  above  hold  if  we  take  p„ 


1  -  -  0 (»'*),  «  =  y  -  A,,  8  -  -jy - y  -  A2,  y 


10 


~  A3,  p 


A? 

To  “  ~  A* 


where $\Delta_1, \Delta_2, \Delta_3, \Delta_4$ are very small positive values chosen so that $\epsilon > 0$, $\delta > 0$, $\gamma > 0$, and
$\rho > 2\Delta_1$.

ON A CLASS OF NASH-SOLVABLE BIMATRIX GAMES

ABSTRACT

This  work  is  concerned  with  a  particular  class  of  bimatrix  games,  the  set  of 
equilibrium  points  of  which  games  possess  many  of  the  properties  of  solutions 
to  zero-sum  games,  including  susceptibility  to  solution  by  linear  programming. 
Results  in  a  more  general  setting  are  also  included.  Some  of  the  results  are  be¬ 
lieved  to  constitute  interesting  potential  additions  to  elementary  courses  in 
game  theory. 


1.  INTRODUCTION 

A bimatrix game is defined by an ordered pair $\langle A,B \rangle$ of $m \times n$ matrices over an ordered
field $F$, together with the Cartesian product $X \times Y$ of all $m$-dimensional probability vectors
$x \in X$ and all $n$-dimensional probability vectors $y \in Y$. If player 1 chooses a strategy (probability
vector) $x$ and player 2 chooses a strategy $y$, the payoffs to the two players, respectively, are
$xAy$ and $xBy$, where $x$ and $y$ are interpreted appropriately as row or column vectors. A pair
$\langle x^*,y^* \rangle$ in $X \times Y$ is an equilibrium point of the game $\langle A,B \rangle$ if $x^*Ay^* \ge xAy^*$ and $x^*By^* \ge
x^*By$, for all probability vectors $x$ and $y$.
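
The equilibrium definition can be checked numerically. The sketch below relies on the standard fact that, because the payoffs are linear in each player's own strategy, it suffices to compare against pure-strategy deviations. The function names and the $2 \times 2$ example matrices are hypothetical and chosen only for illustration; exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction

def bilinear(x, M, y):
    """Return the payoff x M y for row strategy x and column strategy y."""
    return sum(x[i] * M[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))

def is_equilibrium(A, B, x_star, y_star):
    """Check x*Ay* >= xAy* and x*By* >= x*By over all pure deviations."""
    m, n = len(A), len(A[0])
    # Best payoff player 1 can get against y* with a single row.
    best_row = max(sum(A[i][j] * y_star[j] for j in range(n)) for i in range(m))
    # Best payoff player 2 can get against x* with a single column.
    best_col = max(sum(x_star[i] * B[i][j] for i in range(m)) for j in range(n))
    return (bilinear(x_star, A, y_star) >= best_row
            and bilinear(x_star, B, y_star) >= best_col)

half = Fraction(1, 2)
A = [[1, -1], [-1, 1]]   # illustrative payoffs for player 1
B = [[1, 3], [1, -1]]    # illustrative payoffs for player 2
print(is_equilibrium(A, B, (half, half), (half, half)))
print(is_equilibrium(A, B, (1, 0), (1, 0)))
```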

A Nash-solvable bimatrix game is one in which, if $\langle x^*,y^* \rangle$ and $\langle x',y' \rangle$ are both equilibrium
points, then so are $\langle x^*,y' \rangle$ and $\langle x',y^* \rangle$. It is well known that 0-sum bimatrix games
($a_{ij} + b_{ij} = 0$, all $i,j$) are Nash-solvable, and that this property extends to constant-sum games
($a_{ij} + b_{ij} = k$, all $i,j$, for some $k \in F$). It is also well known that in the constant-sum case all
equilibrium points are equivalent in that they provide the same payoffs to both players. This
work generalizes, slightly, that contained in such sources as Luce and Raiffa [9] and Burger
[2], and represents a very small step toward the solution of the open problem of characterizing
Nash-solvable games. In the following, $A_{i\cdot}$ will be the $i$th row of $A$ and $A_{\cdot j}$ the $j$th column of
$A$, and similarly for $B$. The inner product of 2 vectors $u$, $v$ in $E^n$ will be denoted by $(u,v)$. The
ordered pair is $\langle u,v \rangle$.

2.  ROW-CONSTANT-SUM  BIMATRIX  GAMES 

DEFINITION 1: An $m \times n$ bimatrix game $\langle A,B \rangle$ is row-constant-sum if, for each
$i$, $i = 1, \ldots, m$, there is a $k_i \in F$ such that $a_{ij} + b_{ij} = k_i$, $j = 1, \ldots, n$.
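
Definition 1 translates directly into a small test. The helper below is a hypothetical illustration: it returns the row constants $k_1, \ldots, k_m$ when the game is row-constant-sum and `None` otherwise.

```python
def row_constants(A, B):
    """Return [k_1, ..., k_m] if a_ij + b_ij = k_i for every j, else None."""
    constants = []
    for row_a, row_b in zip(A, B):
        # All entrywise sums in one row must coincide.
        sums = {a + b for a, b in zip(row_a, row_b)}
        if len(sums) != 1:
            return None
        constants.append(sums.pop())
    return constants

# Illustrative 2 x 2 game: row sums are 2 in the first row and 0 in the second.
print(row_constants([[1, -1], [-1, 1]], [[1, 3], [1, -1]]))
```

A constant-sum game is the special case in which all returned constants are equal, and a 0-sum game the case in which they are all zero.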

THEOREM 1: Let $\langle x^*,y^* \rangle$ and $\langle x',y' \rangle$ be two equilibrium points for a row-constant-sum
game $\langle A,B \rangle$. Then $\langle x^*,y^* \rangle$ and $\langle x',y' \rangle$ are interchangeable, and they are equivalent
for P1 (player 1). They are equivalent for P2 (player 2) if and only if $\sum_{i=1}^{m} x_i^* k_i = \sum_{i=1}^{m} x_i' k_i$.

PROOF: It is well known and easily proved that $\langle x^*,y^* \rangle$ is an equilibrium point for
$\langle A,B \rangle$ if and only if $x_i^* > 0$ implies that $(A_{i\cdot},y^*) = \max_k (A_{k\cdot},y^*)$ and $y_j^* > 0$ implies that
$(x^*,B_{\cdot j}) = \max_k (x^*,B_{\cdot k})$, for all $i$, $j$. Accordingly, let $\beta^* = x^*By^*$. Then $y_j^* > 0$ implies
$(x^*,B_{\cdot j}) = \beta^* = \sum_i x_i^* k_i - (x^*,A_{\cdot j}) \ge \sum_i x_i^* k_i - (x^*,A_{\cdot r})$ for all $r$, or $(x^*,A_{\cdot r}) \ge (x^*,A_{\cdot j})$, and
$x_i^* > 0$ implies $(A_{i\cdot},y^*) = \alpha^* = \max_k (A_{k\cdot},y^*)$. If $\langle x',y' \rangle$ is any equilibrium point, then we
have that $x^*Ay^* \ge x'Ay^*$ (because $x^*$ is in equilibrium with $y^*$) $\ge x'Ay'$ (because $y'$ is in
equilibrium with $x'$ and by the above argument) $\ge x^*Ay'$ (because $x'$ is in equilibrium with $y'$)
$\ge x^*Ay^*$ (because $y^*$ is in equilibrium with $x^*$ and by the above argument). Thus $\langle x^*,y^* \rangle$
and $\langle x',y' \rangle$ are interchangeable for P1, and equivalent for P1. To show they are interchangeable
for P2, note that $x'By' = \sum_i x_i' k_i - x'Ay' = \sum_i x_i' k_i - x'Ay^*$, or $x'By' = x'By^*$. One can
similarly show that $x^*By^* = x^*By'$, completing this part of the proof.

Suppose now that $\sum_i x_i' k_i = \sum_i x_i^* k_i$. Since $x^*Ay^* = x'Ay^*$, we have that $\sum_i x_i' k_i -
x'Ay^* = \sum_i x_i^* k_i - x^*Ay^*$, or $x'By^* = x^*By^*$, and equivalence follows.

On the other hand, suppose $x^*Ay^* = x^*Ay' = x'Ay^* = x'Ay'$, $x^*By^* = x^*By' = x'By^* =
x'By'$. Then $\sum_i x_i^* k_i - x^*Ay^* = \sum_i x_i' k_i - x'Ay^*$. Since $x'Ay^* = x^*Ay^*$, it follows that $\sum_i x_i^* k_i =
\sum_i x_i' k_i$, and the proof is complete.


It is well known that, if $A$ ($= -B$) is the payoff matrix for a zero-sum game, optimal
strategies $\langle x^*,y^* \rangle$ for the game satisfy the so-called "saddle-point" property: $x^*Ay \ge x^*Ay^* \ge
xAy^*$ for all probability vectors $x$ and $y$, and that, conversely, if $\langle x^*,y^* \rangle$ is a saddle-point of
the function $xAy$, then $\langle x^*,y^* \rangle$ is a solution to the game $A$.
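
The saddle-point property can also be tested directly. As with equilibrium points, checking pure strategies is enough because $xAy$ is linear in each argument. The sketch below uses a hypothetical function name and an illustrative matrix.

```python
from fractions import Fraction

def is_saddle_point(A, x_star, y_star):
    """Check x*Ay >= x*Ay* >= xAy* for all pure strategies x and y."""
    m, n = len(A), len(A[0])
    value = sum(x_star[i] * A[i][j] * y_star[j]
                for i in range(m) for j in range(n))
    # Smallest payoff x* can be held to by a single column.
    worst_col = min(sum(x_star[i] * A[i][j] for i in range(m)) for j in range(n))
    # Largest payoff a single row can earn against y*.
    best_row = max(sum(A[i][j] * y_star[j] for j in range(n)) for i in range(m))
    return worst_col >= value >= best_row

half = Fraction(1, 2)
A = [[1, -1], [-1, 1]]  # illustrative zero-sum payoff matrix
print(is_saddle_point(A, (half, half), (half, half)))
print(is_saddle_point(A, (1, 0), (1, 0)))
```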


THEOREM 2: $\langle x^*,y^* \rangle$ is an equilibrium point of the row-constant-sum game $\langle A,B \rangle$
if, and only if, $\langle x^*,y^* \rangle$ is a saddle-point of the function $\phi(x,y) = xAy$.

PROOF: If $\langle x^*,y^* \rangle$ is an equilibrium point of $\langle A,B \rangle$, then $x^*Ay^* \ge xAy^*$ for all
$x \in X$, from which half of one implication follows. Now, let $K$ be the $m \times n$ matrix

$$K = \begin{pmatrix} k_1 & k_1 & \cdots & k_1 \\ k_2 & k_2 & \cdots & k_2 \\ \vdots & \vdots & & \vdots \\ k_m & k_m & \cdots & k_m \end{pmatrix}$$

of row constants $k_i$, $a_{ij} + b_{ij} = k_i$.

Since $x^*By^* \ge x^*By$ for all $y \in Y$, we have $x^*(K - A)y^* \ge x^*(K - A)y$, from which
$x^*Ay \ge x^*Ay^*$ (since $x^*Ky^* = x^*Ky = \sum_i x_i^* k_i$). This completes one implication. Suppose now
that $\langle x^*,y^* \rangle$ is a saddle-point of $\phi$. From $x^*Ay \ge x^*Ay^*$ it follows that $y_j^* = 0$ if $(x^*,A_{\cdot j}) >
\alpha^* = \min_k (x^*,A_{\cdot k})$, from which, if $y_j^* > 0$, $\sum_i x_i^* k_i - (x^*,A_{\cdot j}) \ge \sum_i x_i^* k_i - (x^*,A_{\cdot k})$ for all $k$, or
$(x^*,B_{\cdot j}) \ge (x^*,B_{\cdot k})$ for all $k$. Finally, it follows from $x^*Ay^* \ge xAy^*$ for all $x$ that $x_i^* = 0$ if
$(A_{i\cdot},y^*) < \max_k (A_{k\cdot},y^*)$, and the proof is complete.


The implication is that any solution of $A$ as a 0-sum game is also an equilibrium point of
the row-constant-sum bimatrix game $\langle A,B \rangle$, and conversely. Thus, a solution of $A$ found by
linear programming will provide an equilibrium point $\langle x^*,y^* \rangle$ for $\langle A,B \rangle$ and the payoff $\alpha$
for P1. The payoff $\beta$ for P2 must be calculated via $x^*By^*$, or via $\sum_i x_i^* k_i - \alpha$.
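
As a small worked check of the last sentence, the sketch below computes P2's payoff both ways for an illustrative row-constant-sum game. The matrices, the strategies, and the function names are hypothetical; the optimal strategies of $A$ are simply given rather than obtained by linear programming.

```python
from fractions import Fraction

def payoff(x, M, y):
    """Return x M y."""
    return sum(x[i] * M[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))

def p2_payoff_from_row_constants(x_star, k, alpha):
    """Return beta = sum_i x_i* k_i - alpha, with alpha the payoff to P1."""
    return sum(xi * ki for xi, ki in zip(x_star, k)) - alpha

half = Fraction(1, 2)
A = [[1, -1], [-1, 1]]
B = [[1, 3], [1, -1]]          # row sums of A + B are k = (2, 0)
k = [2, 0]
x_star = y_star = (half, half)  # optimal for A as a 0-sum game (assumed known)

alpha = payoff(x_star, A, y_star)
beta_direct = payoff(x_star, B, y_star)
beta_from_k = p2_payoff_from_row_constants(x_star, k, alpha)
print(alpha, beta_direct, beta_from_k)
```

Both routes to $\beta$ agree because $B = K - A$ entrywise, so $x^*By^* = \sum_i x_i^* k_i - x^*Ay^*$.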

3.  A  SOMEWHAT  MORE  GENERAL  SETTING 

We now consider the $m \times n$ matrix $A$, we let $B$ be $m \times n$ (not necessarily in row-constant-sum
with $A$) and we henceforth let $X \times Y$ be the set of solutions to $A$ as a 0-sum
game. The following theorem then follows.

THEOREM 3: Let $\langle x^*,y^* \rangle \in X \times Y$. In order for $\langle x^*,y^* \rangle$ to be an equilibrium point
of $\langle A,B \rangle$ regarded as a bimatrix game, it is necessary and sufficient that $x^*By^* \ge x^*By$ for all
probability vectors $y$, or, for $x^*(-B)y \ge x^*(-B)y^*$. It is clearly sufficient for $\langle x^*,y^* \rangle$ to also
be a solution to $(-B)$, regarded as a 0-sum game.

The proof is omitted, as it follows immediately from the definition of equilibrium point.
The following comment is made, however: if $\langle A,B \rangle$ is row-constant-sum, a point $\langle x^*,y^* \rangle$
that solves $A$ as a 0-sum game and is an equilibrium point of $\langle A,B \rangle$ will not necessarily solve
$(-B)$ as a 0-sum game, because the condition $x^*(-B)y^* \ge x(-B)y^*$ holds if and only if
$x^*Ay^* - \sum_{i=1}^{m} x_i^* k_i \ge xAy^* - \sum_{i=1}^{m} x_i k_i$, or $x^*Ay^* - xAy^* \ge \sum_{i=1}^{m} k_i(x_i^* - x_i)$. Thus, the condition
that $\langle x^*,y^* \rangle$ also solve $(-B)$ as a 0-sum game is extremely strong. This illustrates a major
difference between the constant-sum case (in which the above condition will hold if $\langle x^*,y^* \rangle$
solves $A$ as a 0-sum game) and the row-constant-sum case. It is also logical to ask if there are
conditions on $A$ and $B$ which would cause an equilibrium point of $\langle A,B \rangle$ to also solve $A$ and
$-B$ as separate 0-sum games. The conditions are inescapable: $y_j^* > 0$ must imply
$(x^*,A_{\cdot j}) = \min_k (x^*,A_{\cdot k})$ and $x_i^* > 0$ must imply $(B_{i\cdot},y^*) = \min_k (B_{k\cdot},y^*)$. Since, for example,
to be an equilibrium point of $\langle A,B \rangle$ it is necessary that $y_j^* > 0$ imply
$(x^*,B_{\cdot j}) = \max_k (x^*,B_{\cdot k})$, any game satisfying these conditions must be heavily restricted.

Finally, it is noted that if there are common saddle-points of $A$ and $(-B)$, which are therefore
equilibrium points of the bimatrix game $\langle A,B \rangle$, each of these saddle-points will necessarily
provide the same payoffs $\alpha$, $\beta$ to the respective players (note the contrast of the row-constant-sum
case with the constant-sum case).

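DEFINITION 2: A Nash Subset for a game $\langle A,B \rangle$ is a set $S = \{\langle x,y \rangle\}$ of equilibrium
points for $\langle A,B \rangle$ such that, if $\langle x,y \rangle$ and $\langle x',y' \rangle$ are in $S$, so are $\langle x,y' \rangle$ and
$\langle x',y \rangle$. See [6] and [13] for related material.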
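
The interchange condition of Definition 2 amounts to requiring that a finite set $S$ be the full Cartesian product of its $x$-components and its $y$-components. The sketch below checks only that closure property for a finite set of pairs; it assumes the members of $S$ have already been verified to be equilibrium points, and the function name is hypothetical.

```python
def is_closed_under_interchange(S):
    """Return True if (x, y) and (x', y') in S imply (x, y') and (x', y) in S.

    S is an iterable of (x, y) pairs with hashable strategies (e.g. tuples).
    Membership of each pair in the equilibrium set is assumed, not checked.
    """
    pairs = set(S)
    xs = {x for x, _ in pairs}
    ys = {y for _, y in pairs}
    return all((x, y) in pairs for x in xs for y in ys)

S1 = {((1, 0), (1, 0)), ((0, 1), (0, 1))}
S2 = S1 | {((1, 0), (0, 1)), ((0, 1), (1, 0))}
print(is_closed_under_interchange(S1))
print(is_closed_under_interchange(S2))
```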

THEOREM 4: Let $A$ and $B$ be $m \times n$ matrices over the ordered field $F$, and let $X \times Y$ be
the set of all solutions to $A$ regarded as a 0-sum game. In order for $X \times Y$ to constitute a Nash
subset of equilibrium points for $\langle A,B \rangle$, regarded as a bimatrix game, it is necessary and
sufficient that $K(X) = \{k \mid (x^*,A_{\cdot k}) = \min_j (x^*,A_{\cdot j})$, all $x^* \in X\} \subset K'(X) = \{k \mid (x^*,B_{\cdot k}) =
\max_j (x^*,B_{\cdot j})$, all $x^* \in X\}$.

PROOF: Write $K = K(X)$, $K' = K'(X)$, and let $K \subset K'$. Then because any $\langle x^*,y^* \rangle$ in
$X \times Y$ solves $A$ as a 0-sum game, $x^*Ay \ge x^*Ay^* \ge xAy^*$ for all $\langle x^*,y^* \rangle$ in $X \times Y$ and all
probability vectors $x$, $y$. Also, $y_j^* = 0$ if $(x^*,A_{\cdot j}) > \min_k (x^*,A_{\cdot k})$, or if $j \notin K \subset K'$. Hence
$y_j^* = 0$ if $(x^*,B_{\cdot j}) < \max_k (x^*,B_{\cdot k})$ for all $y^* \in Y$, any $x^* \in X$, and $\langle x^*,y^* \rangle$ is an equilibrium
point for $\langle A,B \rangle$, for all $\langle x^*,y^* \rangle \in X \times Y$. Suppose there exists $k' \in K - K'$, so that for
some $x^* \in X$, $(x^*,B_{\cdot k'}) < \max_j (x^*,B_{\cdot j})$ but $(x^*,A_{\cdot k'}) = \min_j (x^*,A_{\cdot j})$. Since it is known that
there exists $y' \in Y$ (see [1], page 52) such that $y'_{k'} > 0$, it follows that $y'$ cannot be in equilibrium
with $x^*$ for $\langle A,B \rangle$ regarded as a bimatrix game, a contradiction. This completes the
proof.
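
The inclusion in Theorem 4 can be checked on a finite list of optimal strategies for player 1, for instance the extreme points of $X$. If a column attains the minimum of $(x,A_{\cdot j})$ at each listed point, it also does so at every convex combination of them, and likewise for the maximum of $(x,B_{\cdot j})$, so checking the listed points covers their convex hull. The sketch below is illustrative only; the function names and example matrices are hypothetical.

```python
from fractions import Fraction

def column_values(x, M):
    """Return the list of inner products (x, M_.j) over columns j."""
    return [sum(x[i] * M[i][j] for i in range(len(x))) for j in range(len(M[0]))]

def common_extremal_columns(points, M, pick):
    """Columns that attain pick(...) (min or max) of (x, M_.j) for every x in points."""
    common = set(range(len(M[0])))
    for x in points:
        vals = column_values(x, M)
        target = pick(vals)
        common &= {j for j, v in enumerate(vals) if v == target}
    return common

def theorem4_inclusion_holds(points, A, B):
    """Check K(X) subset of K'(X) on the given finite list of strategies."""
    K = common_extremal_columns(points, A, min)
    K_prime = common_extremal_columns(points, B, max)
    return K <= K_prime

half = Fraction(1, 2)
A = [[1, -1], [-1, 1]]
print(theorem4_inclusion_holds([(half, half)], A, [[1, 3], [1, -1]]))
print(theorem4_inclusion_holds([(half, half)], A, [[1, 0], [1, 0]]))
```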

COROLLARY 1: Let $X^* \times Y^*$ be any subset of $X \times Y$, the set of all solutions to $A$
regarded as a 0-sum game. In order for $X^* \times Y^*$ to be a set of interchangeable equilibrium
points (a Nash subset) for $\langle A,B \rangle$ regarded as a bimatrix game, it is sufficient that $K(X^*) =
\{k \mid (x^*,A_{\cdot k}) = \min_j (x^*,A_{\cdot j})$ for all $x^* \in X^*\} = K'(X^*) = \{k \mid (x^*,B_{\cdot k}) = \max_j (x^*,B_{\cdot j})$ for all
$x^* \in X^*\}$.

COROLLARY 2: Let $\bar{X} \subset X$, and let $K'(\bar{X})$ be defined as above, and let
$\bar{Y} = \{y \in Y \mid y_j > 0$ implies $j \in K'(\bar{X})\}$. Then $\bar{X} \times \bar{Y}$ is a Nash subset for $\langle A,B \rangle$.

Finally, we consider the construction of all matrices $B$ such that $X \times Y$, the set of solutions
to $A$ as a 0-sum game, will also be a set of equilibrium points for $\langle A,B \rangle$ regarded as a
bimatrix game.

THEOREM 5: Let $A$ be an $m \times n$ matrix over $F$, with $X \times Y$ its solutions as a 0-sum
game. Then a matrix $B$ can be constructed such that $X \times Y$ is a Nash subset for $\langle A,B \rangle$
regarded as a bimatrix game. The equilibrium points $\langle x,y \rangle$ in $X \times Y$ may or may not be
equivalent for P2, depending on construction. Further, all matrices $B$ such that $X \times Y$ is a
Nash subset for $\langle A,B \rangle$ are constructed as described.

PROOF:  Let  .v\  v’ . v*  be  the  extreme  points  of  X,  and  assume  that  v1 . v',  r  ^  k. 

.v1 

are  a  maximal  linearly-independent  subset  of  v1 . v\  Let  \  = 

xr 

matrix  of  a  linear  transformation  from  E"’  to  E\  taken  with  respect  to  a  basis  of  unit  vectors, 
and  let  c'.r,  ...  '  be  a  basis  for  the  nullspace  of  x  Let  0 fi .  ..0,  be  scalars.  Let 

i 

y'.  y\  ■■  v‘  be  the  extreme  points  of  the  set  )'.  and  let  A',./  =  {/l.r/  >  0|.  Let  Kt  -  U  A',/. 

Let  D  ~  |</|(.v  ,d)  =  0,1  ^  j  S;  r|.  and  let  </'.  ...  </"'  ,H  be  m  -  r  +  1  (if  some  0  ^0) 

linearly-independent  solutions  to  the  system  of  r  equations  in  m  variables.  For  j  €  A',,  let 

tfl  •  *  I  '  nt  r  *  | 

B.,  =  £  «„</  +  £  A  i  where  £  <«,  .=  l  (or  at  least,  £«„  =  «  for  some  «;*  Oj,  all 

/.  Then,  if  v  €  .V,  there  arc  scalars  y,.  i  =  I,  . . .  r.  such  that  ,v  —  £  y,.v’,  and  for 

1.i-i 

2jy,.v  .  B  ,  =  j  y,.v',  « In(lr  +  £)/  =  (if  <*  -  1). 

After  all  B.  ,  J€.K t.  have  been  constructed,  for  ./  ?  A*.  let  B.,  be  such  that 
(v.fl.,)  $;  (\  ,B.,t).  h  €  A),  for  all  extreme  points  x'.  i  =  1.  ...  k.  Then,  for  all  y*  6  F,  v* 

€  V  Jwith  v*  =  J)  y * .v'  ,  x*By*  =  J  yJ8  >  v ’By  for  all  probability  vectors  v.  Hence 

.V  x  F  is  a  set  of  interchangeable  equilibrium  points  for  <A.B>  that  would,  for  example,  be 
equivalent  if  0  =  0,  for  all  i.  ./. 

Finally,  suppose  there  is  a  matrix  B  such  that  .V  x  F  is  a  Nash  subset  for  <  A.B>  but 
which  does  not  have  the  above  construction.  Then  there  is  a  column  B.,.  j  €  Ky.  such  that 


be  regarded  as  the 
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»•;  /  »  I  m  t  m  t  -*  I  m  > 

cither  A.  ^  £  <»..</'  +  £  A„i'foranv  coefficients  or  B,  =  £  «„</'  +  £  A  c  hui 

£  a  =  o  =*  <>,  =  l.AtA).  A  =*./.  In  the  first  instance  we  note  ix'.H.,)  -  £;.  I  -  1.  .../. 

-  i 

ami  we  contradict  the  assumption  that  </'.  ...  </"'  ' ’ 1  are  a  maximal  linearly-independent  set  of 

HI  I  *  I 

solutions  to  <  v  .</)  =  ,/=]....  r.  In  the  second  instance,  if  £  ~  ^  1-  lei  \  = 

/- 1 

■  i  ' 

52  y  Then  tv. A.  )  =»  <>  £  y  /J,  =  (v./f.;)=£  y.JJ  for  other  A  C  A,.  so  that  am 

i  .•  i  /-i 

equilibrium  strategy  y  will  either  exclude  ,/.  or  include  j  and  exclude  any  A  such  that  <><  =  I 
lather  contradicts  the  definition  of  A) . 

Note that the matrix $A$ is used only to define $X \times Y$. Given the set $X \times Y$, it follows
that both $A$ and $B$ could be constructed as described, assuming the appropriate dimensionality
conditions.

4. CONCLUSIONS

It is hoped that this slight extension of previously published material regarding Nash-solvable
bimatrix games will lend itself to inclusion in future texts in game theory and operations
research covering 2-person, 0-sum finite games (matrix games). Clearly, nearly any statement
that can be made about solutions of matrix games can also be made about the somewhat
more interesting row-constant-sum bimatrix case, and the usual methods for finding such
solutions carry over with the minor modifications indicated. The reader is also referred to the
excellent text by Vorobyev t2l', and his discussion on "almost antagonistic" bimatrix games
tpp. 10.'- 1 1st for related interesting material.

ABSTRACT

This paper gives characterizations of optimal solutions for convex semi-infinite
programming problems. These characterizations are free of a constraint
qualification assumption. Thus they overcome the deficiencies of the semi-infinite
versions of the Fritz John and the Kuhn-Tucker theories, which give
only necessary or sufficient conditions for optimality, but not both.


1.  INTRODUCTION 

A mathematical programming problem with infinitely many constraints is termed a "semi-infinite
programming problem." Such problems occur in many situations including production
scheduling [10], air pollution problems [6], [7], approximation theory [5], statistics and probability
[9]. For a rather extensive bibliography on semi-infinite programming the reader is
referred to [8].

The  purpose  of  this  paper  is  to  give  necessary  and  sufficient  conditions  of  optimality  for 
convex  semi-infinite  programming  problems.  It  is  well  known  that  the  semi-infinite  versions  of 
both  the  Fritz  John  and  the  Kuhn-Tucker  theories  fail  to  characterize  optimality  (even  in  the 
linear  case)  unless  a  certain  hypothesis,  known  as  a  "constraint  qualification,"  is  imposed  on  the 
problem, e.g. [4], [12]. This paper gives a characterization of optimality without assuming a
constraint  qualification. 


*This research was partially supported by Project No. NR047-021, ONR Contract N00014-75-C-0569 with the Center for
Cybernetic Studies, The University of Texas, and by the National Research Council of Canada.

Optimality theorems without a constraint qualification for ordinary (i.e. with a finite
number of constraints) mathematical programming problems have been obtained in [1]. It
should be noted that the analysis of the semi-infinite case is significantly different, the special
feature being here the topological properties of all constraint functions including the particular
role played by the nonbinding constraints.

The optimality conditions are given in Section 2 for differentiable convex semi-infinite
programming programs, whose constraint functions have the "uniform mean value property."
This class of programs is quite large and it includes programs with arbitrary convex objective
functions and linear or strictly convex constraints. For a particular class of such programs,
namely the programs with "uniformly decreasing" constraint functions, the optimality conditions

“  sh“”"  i "  s'“°"  «•  A  comp, risen  »i,h  the  semi-irtLe  of  Z 

o  bJ  im-ar  Cbeb.shlv  '  *  l!  "'I8'"''11  in  S'‘"0ri  5  An  application  to  the  problem 

of best linear Chebyshev approximation with constraints is demonstrated in Section 6. A linear

from  l41- for  which  ,hs  K“h-T^  ^ 

2-  ™v™ro“  FOR  PROGRAMS  H*V,NC 


Consider the convex semi-infinite programming problem

(P)    Min $f^0(x)$
       s.t. $f^k(x,t) \le 0$ for all $t \in T_k$, $k \in P \triangleq \{1, \ldots, p\}$
            $x \in R^n$

where

$f^0$ is convex and differentiable,
$f^k(x,t)$ is convex and differentiable in $x$ for every $t \in T_k$ and continuous in $t$ for every $x$,
$T_k$ is a compact subset of $R^I$ ($I \ge 1$).

The feasible set of problem (P) is

$F = \{x \in R^n : f^k(x,t) \le 0$ for all $t \in T_k$, $k \in P\}$.

Note that $F$ is a convex set, being the intersection of convex sets.

For $x^* \in F$,

$T_k^* \triangleq \{t \in T_k : f^k(x^*,t) = 0\}$,

$P^* \triangleq \{k \in P : T_k^* \ne \emptyset\}$.

A vector $d \in R^n$ is called a feasible direction at $x^*$ if $x^* + d \in F$. For a given function $f^k(\cdot,t)$,
$k \in \{0\} \cup P$ and for a fixed $t \in T_k$, we define

$D_k(x^*,t) \triangleq \{d \in R^n : \exists\, \bar\alpha > 0 \ni f^k(x^* + \alpha d, t) = f^k(x^*,t)$ for all $0 \le \alpha \le \bar\alpha\}$.

This set is called the cone of directions of constancy in [1], where it has been shown that, for a
differentiable convex function $f^k(\cdot,t)$, it is a convex cone contained in the subspace

$\{d : d^T \nabla f^k(x^*,t) = 0\}$.

Furthermore, if $f^k(\cdot,t)$ is an analytic convex function, then $D_k(x^*,t)$ is a subspace (not depending
on $x^*$), see [1, Example 4]. In the sequel the derivative of $f$ with respect to $x$, i.e.
$\nabla_x f(x,t)$, is denoted by $\nabla f(x,t)$.
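
The feasible set $F$ and the binding index set $T_k^*$ can be explored numerically by replacing each compact set $T_k$ with a finite grid. The sketch below is only an approximation under that discretization assumption: a point declared feasible on the grid may violate a constraint between grid points. The constraint $f(x,t) = tx - 1$ on $T = [0,1]$, the tolerance, and all function names are hypothetical.

```python
def grid(lo, hi, steps):
    """Return steps + 1 equally spaced points covering [lo, hi]."""
    return [lo + (hi - lo) * i / steps for i in range(steps + 1)]

def is_feasible(x, constraints, tol=1e-9):
    """Check f(x, t) <= 0 at every grid point of every constraint.

    constraints is a list of (f, t_grid) pairs; feasibility is only
    verified on the grid, not on the whole index set.
    """
    return all(f(x, t) <= tol for f, t_grid in constraints for t in t_grid)

def binding_indices(x, f, t_grid, tol=1e-9):
    """Grid approximation of T* = {t in T : f(x, t) = 0}."""
    return [t for t in t_grid if abs(f(x, t)) <= tol]

def f1(x, t):
    # Illustrative constraint: t * x - 1 <= 0 for all t in [0, 1], i.e. x <= 1.
    return t * x - 1.0

T = grid(0.0, 1.0, 10)
print(is_feasible(1.0, [(f1, T)]))
print(binding_indices(1.0, f1, T))
print(is_feasible(1.5, [(f1, T)]))
```

In this example the point $x^* = 1$ is feasible and the only binding index on the grid is $t = 1$.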

Optimality conditions will be given for problem (P) if the constraint functions have the
"uniform mean value property" which is defined as follows.

DEFINITION 1: Let $T$ be a compact set in $R^I$. A function $f: R^n \times T \to R$ has the uniform
mean value property at $x \in R^n$ if, for every nonzero $d \in R^n$ and every $\alpha > 0$, there
exists $\bar\alpha = \bar\alpha(d,\alpha)$, $0 < \bar\alpha \le \alpha$, such that

(MV)    $\dfrac{f(x + \alpha d, t) - f(x,t)}{\alpha} \ge d^T \nabla f(x + \bar\alpha d, t)$ for every $t \in T$.

If $f(\cdot,t)$ is a linear function in $x$ for every $t \in T$, i.e. if $f$ is of the form

$f(x,t) = g(t) + \sum_{i=1}^{n} x_i g_i(t)$,

or if $f(\cdot,t)$ is a differentiable strictly convex function in $x$ for every $t \in T$, i.e. if

$f(\lambda x + (1 - \lambda)y, t) < \lambda f(x,t) + (1 - \lambda)f(y,t)$ for every $t \in T$

where $x \in R^n$ is arbitrary, $y \ne x$, $0 < \lambda < 1$, and if $f(x,\cdot)$ is continuous in $t$ for every $x$, then
$f$ has the uniform mean value property. For a linear function $f$, one finds $d^T \nabla f(x + \bar\alpha d, t) =
\sum_{i=1}^{n} d_i g_i(t)$ and (MV) is obviously satisfied. The mean value property for strictly convex functions
follows immediately from e.g. [14, Corollary 25.5.1 and Theorem 25.7].

EXAMPLE 1: Function

$f^1(x,t) = t[(x - t)^2 - t^2]$ for every $t \in T = [0,1]$

is neither linear nor strictly convex in $x \in R$ for every $t \in T$. However $f^1$ has the uniform
mean value property. Function

Xi2  +  rx>(x2  -  /)  if  x 2  <  y  t 
f2ixt.x2.t)  =  3 

X\  +  (x2  -  t  +  1)  (x2-  1)  if  x2  ^  —  t 

for  every  t  €  T  =  [0, 1]  does  not  have  the  uniform  mean  value  property  at  the  origin.  Note 
that  f2  is  convex  and  differentiable  in  x  6  R2  for  every  t  €  T  and  continuous  in  t  €  T  for 
every  x.  This  function  has  provided  counterexamples  to  some  of  our  early  conjectures. 
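As a numerical sanity check of the claim about $f^1$, the following Python sketch tests (MV) on a grid of $t$ values. It assumes the reading $f^1(x,t) = t[(x-t)^2 - t^2]$ of the formula in Example 1 and tries half of the full step as the intermediate step; the names `f1`, `grad_f1`, `mv_holds` and the grids are illustrative assumptions, not part of the paper.

```python
def f1(x, t):
    """Assumed reading of f^1 from Example 1: t * ((x - t)^2 - t^2)."""
    return t * ((x - t) ** 2 - t ** 2)


def grad_f1(x, t):
    """Derivative of f1 with respect to x."""
    return 2.0 * t * (x - t)


def mv_holds(x, d, alpha_bar, alpha_tilde, ts, tol=1e-9):
    """Check inequality (MV) at x in direction d for every t in the grid ts."""
    for t in ts:
        quotient = (f1(x + alpha_bar * d, t) - f1(x, t)) / alpha_bar
        slope = d * grad_f1(x + alpha_tilde * d, t)
        if quotient < slope - tol:
            return False
    return True


T_GRID = [i / 100 for i in range(101)]  # grid on T = [0, 1]

# Candidate intermediate step: half of the full step (an assumption to test).
half_step_ok = all(
    mv_holds(x, d, a, a / 2, T_GRID)
    for x in (-1.0, 0.0, 0.7)
    for d in (-2.0, 1.0)
    for a in (0.1, 1.0, 3.0)
)
# Taking the full step as the intermediate point violates (MV) for t > 0.
full_step_ok = mv_holds(0.0, 1.0, 1.0, 1.0, T_GRID)
print(half_step_ok, full_step_ok)
```

The point of the check is that a single intermediate step works for every $t$ in the grid at once, which is what "uniform" refers to in Definition 1.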

Optimality conditions will now be given for problem (P).

THEOREM 1: Let $x^*$ be a feasible solution of problem (P) where $f^k$, $k \in P^*$ have the uniform mean value property. Then $x^*$ is an optimal solution of (P) if, and only if, for every $\alpha^* > 0$ the system

(A)  $d'\nabla f^0(x^*) < 0$,

(B)  $d'\nabla f^k(x^* + \alpha^* d, t) \le 0$  for all $t \in T_k^*$,

(C)  $\frac{d'\nabla f^k(x^* + \alpha^* d, t)}{f^k(x^*,t)} \ge -\frac{1}{\alpha^*}$  for all $t \in T_k \setminus T_k^*$,

$k \in P^*$

is inconsistent.

PROOF: We will show that $x^*$ is nonoptimal if, and only if, there exists $\alpha^* > 0$ such that the system (A), (B), (C) is consistent. A feasible $x^*$ is nonoptimal if, and only if, there exist $\bar\alpha > 0$ and $d \in R^n$, $d \ne 0$, such that

(1)  $f^0(x^* + \bar\alpha d) < f^0(x^*)$

(2)  $f^k(x^* + \bar\alpha d, t) \le 0$  for every $t \in T_k$, $k \in P$.

By the convexity of $f^0$ and the gradient inequality, the existence of $\bar\alpha > 0$ satisfying (1) is equivalent to

$d'\nabla f^0(x^*) < 0$.

By the continuity of $f^k(\cdot,t)$, $k \in P$, the constraints with $k \in P \setminus P^*$ can be omitted from discussion. We consider (2), for some given $k \in P^*$, and discuss separately the two cases: $t \in T_k^*$ and $t \in T_k \setminus T_k^*$. Thus (2) can be written

(2-a)  $f^k(x^* + \bar\alpha d, t) \le 0$  for every $t \in T_k^*$

(2-b)  $f^k(x^* + \bar\alpha d, t) \le 0$  for every $t \in T_k \setminus T_k^*$.

Consider first (2-a) for some fixed $k \in P^*$. By the convexity and uniform mean value property of $f^k$,

(3)  $f^k(x^* + \bar\alpha d, t) \ge f^k(x^*,t) + \bar\alpha\, d'\nabla f^k(x^* + \alpha_k d, t)$  for all $t \in T_k^*$

and for some

$0 < \alpha_k \le \bar\alpha$.

Since $t \in T_k^*$ and $\bar\alpha > 0$, (2-a) implies

(4)  $d'\nabla f^k(x^* + \alpha_k d, t) \le 0$.

Denote

(5)  $\underline{\alpha} = \min_{k \in P^*} \{\alpha_k\}$.

Clearly, $\underline{\alpha}$ always exists (since $P$ is finite) and it is positive. By the convexity of $f^k(\cdot,t)$, (5) and (4),

(6)  $d'\nabla f^k(x^* + \underline{\alpha} d, t) \le d'\nabla f^k(x^* + \alpha_k d, t) \le 0$.

On the other hand, the existence of $\alpha^* > 0$ such that, for every $t \in T_k^*$ and all $k \in P^*$,
$d'\nabla f^k(x^* + \alpha^* d, t) \le 0$
implies (2-a) with $0 < \bar\alpha \le \alpha^*$.
It is left to show that the existence of $\bar\alpha > 0$, such that (2-b) holds, is equivalent to the existence of $\alpha^* > 0$, such that (C) holds. Suppose that (2-b) holds for some $\bar\alpha > 0$. Then, by the convexity and uniform mean value property, for $k \in P^*$,

$0 \ge f^k(x^* + \bar\alpha d, t) \ge f^k(x^*,t) + \bar\alpha\, d'\nabla f^k(x^* + \tilde\alpha_k d, t)$  for all $t \in T_k \setminus T_k^*$

and for some

(7)  $0 < \tilde\alpha_k \le \bar\alpha$.

Hence,

(8)  $\frac{d'\nabla f^k(x^* + \tilde\alpha_k d, t)}{f^k(x^*,t)} \ge -\frac{1}{\bar\alpha}$,  since $t \in T_k \setminus T_k^*$
     $\ge -\frac{1}{\tilde\alpha_k}$,  by (7).

Denote

(9)  $\tilde\alpha = \min_{k \in P^*} \{\tilde\alpha_k\} > 0$.

Using the monotonicity of the gradient of the convex function $f^k(\cdot,t)$, one obtains here

(10)  $\frac{d'\nabla f^k(x^* + \alpha d, t)}{f^k(x^*,t)} \ge \frac{d'\nabla f^k(x^* + \tilde\alpha_k d, t)}{f^k(x^*,t)}$  for every $0 < \alpha \le \tilde\alpha_k$.

This gives

$\frac{d'\nabla f^k(x^* + \tilde\alpha d, t)}{f^k(x^*,t)} \ge -\frac{1}{\tilde\alpha_k}$,  by (10) and (8)
     $\ge -\frac{1}{\tilde\alpha}$,  by (9)

which is (C) with $\alpha^* = \tilde\alpha$.


Suppose now that (C) is true for some $\alpha^* > 0$. Using again the monotonicity of the gradient of the convex function $f^k(\cdot,t)$, and the fact that $f^k(x^*,t) < 0$ for $t \in T_k \setminus T_k^*$, one easily obtains

(11)  $f^k(x^*,t) + \alpha^* d'\nabla f^k(x^* + \alpha d, t) \le 0$,  for every $0 < \alpha \le \alpha^*$.

But

$f^k(x^* + \alpha^* d, t) = f^k(x^*,t) + \alpha^* d'\nabla f^k(x^* + \alpha_k d, t)$,
for some particular $0 < \alpha_k < \alpha^*$, $\alpha_k = \alpha_k(t)$, by the mean value theorem,
$\le 0$, by (11),

which is (2-b) with $\bar\alpha = \alpha^*$.


Summarizing the above results one derives the following conclusion: If $x^*$ is not optimal then there exists $\alpha^* = \min\{\underline{\alpha}, \tilde\alpha\} > 0$ such that the system (A), (B) and (C) is consistent. If there exists $\alpha^* > 0$ such that the system (A), (B) and (C) is consistent, then there exist $\alpha_0 > 0$ and $\bar\alpha > 0$ such that

(12)  $f^0(x^* + \alpha_0 d) < f^0(x^*)$
      $f^k(x^* + \bar\alpha d, t) \le 0$  for every $t \in T_k^*$
      $f^k(x^* + \bar\alpha d, t) \le 0$  for every $t \in T_k \setminus T_k^*$
      $k \in P^*$.

If one denotes

$\hat\alpha = \min\{\alpha_0, \bar\alpha\} > 0$

then, again by the convexity of $f^k(\cdot,t)$, $k \in \{0\} \cup P$, (12) can be written

$f^0(x^* + \hat\alpha d) < f^0(x^*)$
$f^k(x^* + \hat\alpha d, t) \le 0$  for every $t \in T_k$,
$k \in P^*$

implying that $x^*$ is not optimal.

□


REMARK 1: Since $\nabla f^k(x,\cdot)$ is continuous for every $x$ in some neighbourhood of $x^*$ (this follows from e.g. [14, Theorem 25.7]), condition (C) in Theorem 1 needs checking only at the points $t \in T_k$ which are in

$N_k \triangleq \bigcup_{t^* \in T_k^*} N(t^*)$,

where $N(t^*)$ is a fixed open neighbourhood of $t^*$. For the points $t$ in $T_k \setminus N_k$ one can always find $\alpha^*$ which satisfies (C). This follows from the fact that for every $\alpha$,

(13)  $\frac{d'\nabla f^k(x^* + \alpha d, t)}{f^k(x^*,t)} \ge -M$

for some positive constant $M$, by the compactness of $T_k \setminus N_k$. Choose $M$ in (13) large enough, so that

(14)  $\alpha^* \triangleq \frac{1}{M} \le \tilde\alpha$.

Now,

$\frac{d'\nabla f^k(x^* + \alpha^* d, t)}{f^k(x^*,t)} \ge \frac{d'\nabla f^k(x^* + \tilde\alpha d, t)}{f^k(x^*,t)}$,  by (10) and (14)
     $\ge -\frac{1}{\alpha^*}$,  by (13) and (14).


EXAMPLE 2: The purpose of this example is to show that Theorem 1 fails if the constraint functions do not have the uniform mean value property. Consider

Min $-x_2$

subject to

$f(x_1,x_2,t) \le 0$  for all $t \in [0,1]$

where

$f(x_1,x_2,t) = \begin{cases} x_1^2 + t x_2 (x_2 - t) & \text{if } x_2 \le \tfrac{1}{2} t \\ x_1^2 + \frac{t^3}{(2-t)^2}(x_2 - t + 1)(x_2 - 1) & \text{if } x_2 \ge \tfrac{1}{2} t. \end{cases}$

Function $f$ satisfies the assumptions of problem (P) but it does not enjoy the uniform mean value property. The feasible set is

$F = \{(0, x_2)' : 0 \le x_2 \le 1\}$

and the optimal solution is $x = (0,1)'$. However, for every $\alpha^* > 0$, the system (A), (B) and (C) is inconsistent at $x^* = 0$, a nonoptimal point. Since $T^* = [0,1]$, condition (C) is here redundant, while (A) and (B) become, respectively,

$-d_2 < 0$

$2\alpha^* d_1^2 + t(2\alpha^* d_2 - t) d_2 \le 0$  if $2\alpha^* d_2 \le t$

$2\alpha^* d_1^2 + \frac{t^3}{(2-t)^2}(2\alpha^* d_2 - t) d_2 \le 0$  if $2\alpha^* d_2 \ge t$.

The above system cannot be consistent for some $\alpha^* > 0$, because, if it were, the last inequality would be absurd for small $t \in [0,1]$.
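The claims of Example 2 can be probed numerically. The sketch below assumes the piecewise reading of $f$ with the coefficient $t^3/(2-t)^2$ on the second branch (the value that makes the two branches agree at $x_2 = t/2$); the function names, grids and sample directions are hypothetical choices made only for illustration.

```python
def f(x1, x2, t):
    """Assumed reading of the constraint function of Example 2."""
    if x2 <= t / 2:
        return x1 ** 2 + t * x2 * (x2 - t)
    return x1 ** 2 + t ** 3 / (2 - t) ** 2 * (x2 - t + 1) * (x2 - 1)


def directional_derivative(alpha, d1, d2, t):
    """d' grad f evaluated at the point alpha * d (base point: the origin)."""
    x1, x2 = alpha * d1, alpha * d2
    coef = t if x2 <= t / 2 else t ** 3 / (2 - t) ** 2
    return 2 * x1 * d1 + coef * (2 * x2 - t) * d2


def feasible(x1, x2, ts, tol=1e-12):
    """Feasibility on the grid ts: f(x, t) <= 0 for every sampled t."""
    return all(f(x1, x2, t) <= tol for t in ts)


def system_ab_consistent(alpha, d1, d2, ts):
    """(A): -d2 < 0, and (B): directional derivative <= 0 for all sampled t."""
    return -d2 < 0 and all(
        directional_derivative(alpha, d1, d2, t) <= 0 for t in ts
    )


T_GRID = [i / 1000 for i in range(1001)]  # grid on T = T* = [0, 1]
any_consistent = any(
    system_ab_consistent(alpha, d1, d2, T_GRID)
    for alpha in (0.01, 0.5, 2.0)
    for d1 in (-1.0, 0.0, 1.0)
    for d2 in (0.1, 1.0, 5.0)
)
print(feasible(0.0, 0.0, T_GRID), feasible(0.0, 1.0, T_GRID), any_consistent)
```

On these samples both the origin and $(0,1)'$ are feasible, yet no sampled direction satisfies (A) and (B) at the origin, which is the failure the example describes.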

When the constraint functions (but not necessarily the objective function) are linear, i.e. when (P) is of the form

(L)  Min $f^0(x)$
     s.t.
     $g_0^k(t) + \sum_{i=1}^{n} x_i g_i^k(t) \le 0$,  for all $t \in T_k$, $k \in P$

then Theorem 1 can be considerably simplified.

COROLLARY 1: Let $x^*$ be a feasible solution of problem (L). Then $x^*$ is optimal if, and only if, the system

(A)  $d'\nabla f^0(x^*) < 0$

(B_1)  $\sum_{i=1}^{n} d_i g_i^k(t) \le 0$,  for all $t \in T_k^*$

(C_1)  $\frac{\sum_{i=1}^{n} d_i g_i^k(t)}{g_0^k(t) + \sum_{i=1}^{n} x_i^* g_i^k(t)} \ge -1$,  for all $t \in T_k \setminus T_k^*$

$k \in P^*$

is inconsistent.
PROOF: Recall that linear functions have the uniform mean value property. If $f^k(\cdot,t)$ is linear, then for every $t \in T_k$

$D_k(x,t) = \{d \in R^n : d'\nabla f^k(x,t) = 0\}$.

Thus (B) reduces to (B_1). The left hand side of (C) reduces to the left hand side of (C_1), which does not depend on $\alpha^*$. Moreover, $\alpha^*$ on the right hand side of (C) can be taken $\alpha^* = 1$, because whenever $d$ satisfies (A) and (B_1), so does $\bar d = \frac{1}{\alpha^*} d$.

□
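To make the linear case concrete, the following sketch checks the system (A), (B_1), (C_1) for a small hypothetical problem that is not taken from the paper: Min $-x$ subject to $t x - t^2 \le 0$ for all $t \in [0,1]$, so that $g_0(t) = -t^2$ and $g_1(t) = t$. The grids of $t$ values and of candidate directions $d$ are assumptions, so the check is a sampling heuristic rather than a proof.

```python
def g0(t):
    """Constant term of the hypothetical linear constraint."""
    return -t * t


def g1(t):
    """Coefficient of x in the hypothetical linear constraint."""
    return t


def corollary1_system_consistent(x_star, grad_f0, ds, ts, tol=1e-12):
    """Search the grid ds for a direction satisfying (A), (B_1) and (C_1)."""
    for d in ds:
        if d * grad_f0 >= 0:                 # (A) fails for this d
            continue
        ok = True
        for t in ts:
            value = g0(t) + x_star * g1(t)   # constraint value at x*
            if abs(value) <= tol:            # t is in T*: check (B_1)
                ok = d * g1(t) <= 0
            else:                            # t is in T \ T*: check (C_1)
                ok = d * g1(t) / value >= -1
            if not ok:
                break
        if ok:
            return True
    return False


TS = [i / 1000 for i in range(1001)]
DS = [k / 10 for k in range(-30, 31) if k != 0]

print(corollary1_system_consistent(0.0, -1.0, DS, TS))   # x* = 0
print(corollary1_system_consistent(-1.0, -1.0, DS, TS))  # x* = -1
```

For this toy problem the feasible set is $x \le 0$, so $x^* = 0$ is optimal and the sampled system is inconsistent there, while at the feasible but nonoptimal point $x^* = -1$ a small positive $d$ satisfies all three conditions.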


In many practical situations the sets $T_k$, $k \in P$ are compact intervals and the sets $T_k^*$, $k \in P^*$ are finite. (This is always the case when $f^k(x^*,\cdot)$ are analytic functions not identically zero.) For such cases condition (B) can be replaced by a finite number of linear inequalities.

COROLLARY 2: Let $x^*$ be a feasible solution of problem (P), where $f^k$, $k \in P^*$ have the uniform mean value property. Suppose that all the sets $T_k^*$, $k \in P^*$ are finite. Then a feasible solution $x^*$ of problem (P) is optimal if, and only if, for every $\alpha^* > 0$ and for every subset $\Omega_k$ of $T_k^*$ the system

(A)  $d'\nabla f^0(x^*) < 0$

(B_2)  $d'\nabla f^k(x^*,t) < 0$,  $t \in \Omega_k$
       $d \in D_k(x^*,t)$,  $t \in T_k^* \setminus \Omega_k$

(C)  $\frac{d'\nabla f^k(x^* + \alpha^* d, t)}{f^k(x^*,t)} \ge -\frac{1}{\alpha^*}$  for all $t \in T_k \setminus T_k^*$,

$k \in P^*$

is inconsistent.


An important special case of Corollary 2 is when the sets $T_k$ themselves are finite. Then problem (P) can be reduced to a mathematical program of the form

(MP)  Min $f^0(x)$
      s.t.
      $f^k(x) \le 0$,  $k \in P$.

This is obtained by setting $T_k = \{k_1, k_2, \dots, k_{\mathrm{card}\,T_k}\}$, $k = 1,2,\dots,p$, and identifying $\{f^k(x,k_i) : k_i \in T_k,\ k = 1,2,\dots,p\}$ with $\{f^k(x) : k \in P \triangleq \{1,2,\dots,\sum_{k=1}^{p} \mathrm{card}\,T_k\}\}$. Here $P^* = \{k \in P : f^k(x^*) = 0\}$. Also $\{D_k(x^*,k_i) : k_i \in T_k,\ k = 1,2,\dots,p\}$ is denoted by $\{D_k(x^*) : k \in P\}$.

The major difference between the semi-infinite problem (P) and the mathematical problem (MP) is that for the latter the condition (C) is redundant; Theorem 1 then reduces to the following result obtained in [1, Theorem 1].
COROLLARY 3: Consider problem (MP), where $\{f^k : k \in \{0\} \cup P\}$ are differentiable convex functions: $R^n \to R$. A feasible solution $x^*$ of (MP) is optimal if, and only if, for every subset $\Omega$ of $P^*$ the system

$d'\nabla f^0(x^*) < 0$

$d'\nabla f^k(x^*) < 0$,  $k \in \Omega$

$d \in D_k(x^*)$,  $k \in P^* \setminus \Omega$

is inconsistent.

PROOF: Here condition (C) becomes

$\frac{d'\nabla f^k(x^* + \alpha^* d)}{f^k(x^*)} \ge -\frac{1}{\alpha^*}$,  $k \in P \setminus P^*$

for some $\alpha^* > 0$. Since here the set $P \setminus P^*$ is finite, and hence compact, the redundancy of condition (C) is shown as in Remark 1.

□
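A minimal sketch of how the subset systems of Corollary 3 can be enumerated is given below for programs in one variable, where only the directions $d = -1$ and $d = +1$ need to be tried because each condition is unchanged when $d$ is multiplied by a positive scalar. The test problems and all names are hypothetical, and membership in the cone of directions of constancy is approximated by sampling a few step sizes, which is an assumption and not the paper's definition.

```python
from itertools import combinations


def in_constancy_cone(f, x, d, steps=(1e-3, 1e-2, 1e-1), tol=1e-12):
    """Heuristic test of d in D(x): f stays constant along x + a*d."""
    return all(abs(f(x + a * d) - f(x)) <= tol for a in steps)


def optimal_by_corollary3(grad_f0, active, x_star, directions=(-1.0, 1.0)):
    """active: list of (f, grad_f) pairs for the constraints with f(x*) = 0."""
    indices = range(len(active))
    for r in range(len(active) + 1):
        for omega in combinations(indices, r):
            for d in directions:
                if d * grad_f0(x_star) >= 0:     # descent condition fails
                    continue
                holds = True
                for k, (f, grad_f) in enumerate(active):
                    if k in omega:
                        holds = d * grad_f(x_star) < 0
                    else:
                        holds = in_constancy_cone(f, x_star, d)
                    if not holds:
                        break
                if holds:
                    return False   # a consistent system: x* is not optimal
    return True


# Hypothetical program: Min -x  s.t.  x^2 <= 0, whose only feasible point is 0.
square = (lambda x: x * x, lambda x: 2.0 * x)
print(optimal_by_corollary3(lambda x: -1.0, [square], 0.0))

# Hypothetical program: Min x  s.t.  x - 1 <= 0, tested at the boundary x* = 1.
shifted = (lambda x: x - 1.0, lambda x: 1.0)
print(optimal_by_corollary3(lambda x: 1.0, [shifted], 1.0))
```

In the first program the gradient of the constraint vanishes at the optimum, so no Kuhn-Tucker multiplier exists, yet every subset system is inconsistent and optimality is still certified.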


The following result gives a characterization of a unique optimal solution of problem (P).

THEOREM 2: Let $x^*$ be a feasible solution of problem (P), where $f^k$, $k \in P^*$ have the uniform mean value property. Then $x^*$ is a unique optimal solution of problem (P) if, and only if, for every $\alpha^* > 0$ there is no $d$ satisfying conditions (B), (C) and

(A_1)  $d'\nabla f^0(x^*) < 0$  or  $d \in D_0(x^*)$.

PROOF: Suppose that the system (A_1), (B), (C) is inconsistent. Then so is the system (A), (B), (C). Hence, by Theorem 1, $x^*$ is an optimal solution. Suppose that $x^*$ is not a unique optimal solution. Then there exist $\bar\alpha > 0$ and $d \ne 0$ such that $x = x^* + \bar\alpha d$ is feasible, which implies that $d$ satisfies (B), (C) and $f^0(x^*) = f^0(x^* + \bar\alpha d)$. Since the set of all optimal solutions of a convex program is convex, the latter implies $f^0(x^*) = f^0(x^* + \alpha d)$ for all $0 \le \alpha \le \bar\alpha$, i.e., $d \in D_0(x^*)$. Thus $d$ satisfies (A_1), (B) and (C), which is impossible. Therefore $x^*$ is the unique optimum. The necessity follows by a similar argument.

□


3.  OPTIMALITY  CONDITIONS  FOR  STRICTLY  CONVEX  FUNCTIONS 
IN  THEIR  ACTUAL  VARIABLES 

This  section  can  be  skipped  without  hindering  the  study  of  Section  4. 

In order to state our next result, which is a characterization of optimality for a subclass of convex functions, i.e. strictly convex functions in their "actual variables", we adopt some notions from [1].

For every $k \in P$ and $t \in T_k$, denote by $[k](t)$ (read "block $k$"), the following index subset of $\{1,\dots,n\}$: $j \in [k](t)$ if, and only if, $f_j^k: R \to R$, defined by

$f_j^k(\cdot) \triangleq f^k(x_1,\dots,x_{j-1},\cdot,x_{j+1},\dots,x_n,t)$

is not a constant function for some fixed $x_1,\dots,x_{j-1},x_{j+1},\dots,x_n$. Thus, for a given $t \in T_k$, $[k](t)$ is the set of indices of those variables on which $f^k(\cdot,t)$ actually depends. These "actual variables" determine the vector $x_{[k](t)}$, obtained from $x = (x_1,\dots,x_n)'$ by deleting the variables $\{x_j : j \notin [k](t)\}$, without changing the order of the remaining ones. Similarly, we denote by $f^{[k](t)}: R^{\mathrm{card}\,[k](t)} \to R$ the restriction of $f^k$ to $R^{\mathrm{card}\,[k](t)}$.

DEFINITION 2: A function $f^k: R^n \times T_k \to R$ is strictly convex in its actual variables if for every $t \in T_k$ its restriction $f^{[k](t)}(\cdot,t)$ is strictly convex.

The  above  concept  will  be  illustrated  by  an  example. 


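EXAMPLE 3: Consider

$f^1(x,t) = x_1^2 + t x_2^2$,  $t \in T = [0,1]$.

Note that function $f^1(\cdot,t)$ is not strictly convex for every $t \in T$. Here

$[1](t) = \begin{cases} \{1\} & \text{if } t = 0 \\ \{1,2\} & \text{if } t \in (0,1], \end{cases}$

$x_{[1](t)} = \begin{cases} (x_1) & \text{if } t = 0 \\ (x_1,x_2)' & \text{if } t \in (0,1] \end{cases}$

and

$f^{[1](t)} = \begin{cases} x_1^2 & \text{if } t = 0 \\ x_1^2 + t x_2^2 & \text{if } t \in (0,1], \end{cases}$

clearly a strictly convex function in its actual variables for every $t \in T$. Hence, $f^1$ is a strictly convex function in its actual variables.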
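The block of actual variables in Example 3 can be approximated by a simple sampling test, sketched below. The reading $f^1(x,t) = x_1^2 + t x_2^2$ is assumed, and the helper `block`, its base point and its probe values are hypothetical; a variable is reported as "actual" when sweeping it changes the function value, so a function that varies only outside the sampled values would be missed.

```python
def f1(x, t):
    """Assumed reading of Example 3: f^1(x, t) = x_1^2 + t * x_2^2."""
    return x[0] ** 2 + t * x[1] ** 2


def block(f, t, n, base=(0.3, -0.7), probes=(-1.0, 0.5, 2.0)):
    """Return the 1-based indices j for which f(., t) varies with x_j.

    Sampling heuristic: x_j is swept over a few probe values while the
    other coordinates stay at the base point.
    """
    indices = []
    for j in range(n):
        values = set()
        for p in probes:
            x = list(base)
            x[j] = p
            values.add(f(x, t))
        if len(values) > 1:
            indices.append(j + 1)
    return indices


print(block(f1, 0.0, 2))  # block at t = 0
print(block(f1, 0.5, 2))  # block at an interior t
```

The change of the block between $t = 0$ and $t > 0$ is the feature that later separates functions with strictly convex restrictions from functions that are strictly convex in all variables.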


COROLLARY 4: Let $x^*$ be a feasible solution of problem (P), where $f^k(\cdot,t)$, $k \in P^*$ are strictly convex in their actual variables and have the uniform mean value property. Then $x^*$ is an optimal solution of (P) if, and only if, for every $\alpha^* > 0$ and every subset $\Omega_k \subset T_k^*$ the system

(A)  $d'\nabla f^0(x^*) < 0$

(B,Ω)  $d'\nabla f^k(x^* + \alpha^* d, t) < 0$  for all $t \in T_k^* \setminus \Omega_k$

(C)  $\frac{d'\nabla f^k(x^* + \alpha^* d, t)}{f^k(x^*,t)} \ge -\frac{1}{\alpha^*}$  for all $t \in T_k \setminus T_k^*$

(D,Ω)  $d_{[k](t)} = 0$  for all $t \in \Omega_k$,

$k \in P^*$

is inconsistent.

PROOF: We know, by Theorem 1, that $x^*$ is nonoptimal if, and only if, there exists $\alpha^* > 0$ such that the system (A), (B), (C) is consistent. In order to prove Corollary 4, it is enough to show that (B) is consistent if, and only if, for some subsets $\Omega_k \subset T_k^*$, $k \in P^*$, the system (B,Ω), (D,Ω) is consistent. Suppose that (B) holds. For every $k \in P^*$ define

$\Omega_k \triangleq \{t \in T_k^* : d'\nabla f^k(x^* + \alpha d, t) = 0 \text{ for all } 0 < \alpha \le \alpha^*\}$.

Hence, by the mean value theorem, for every $t \in \Omega_k$

$f^k(x^* + \alpha d, t) = f^k(x^*,t)$  for all $0 < \alpha \le \alpha^*$.

Since $f^k(\cdot,t)$ is strictly convex in its actual variables, this is equivalent to

$d_{[k](t)} = 0$  for all $t \in \Omega_k$.

If $t \in T_k^* \setminus \Omega_k$ then obviously $d'\nabla f^k(x^* + \alpha d, t) < 0$ for some $0 < \alpha \le \alpha^*$, by (B). Thus (B,Ω), (D,Ω) holds for $\Omega = \Omega_k$. (Note that some or all $\Omega_k$'s may be empty.) The reverse statement follows from the observation that $d_{[k](t)} = 0$ implies $d'\nabla f^k(x^* + \alpha^* d, t) = 0$.

□

If a function $f^k(\cdot,t)$ is strictly convex (in all variables $x_1,\dots,x_n$) for every $t \in T_k$, $k \in P^*$, then $D_k(x^*,t) = \{0\}$. This implies that the system (A), (B,Ω), (C), (D,Ω) is inconsistent for every nonempty $\Omega_k$, $k \in P^*$. Thus condition (D,Ω) is redundant. In fact, condition (C) is also redundant, which follows by the following lemma.

LEMMA 1: Let $f(x,t)$ be convex and differentiable in $x \in R^n$ for every $t$ in a compact set $T \subset R^1$ and continuous in $t$ for every $x$. If for some $d \in R^n$,

(15)  $d'\nabla f(x^*,t) < 0$  for all $t \in T^* = \{t : f(x^*,t) = 0\}$,

then there exists $\bar\alpha > 0$ such that

(16)  $f(x^* + \bar\alpha d, t) \le 0$  for all $t \in T \setminus T^*$.

PROOF: It is enough to show that the hypothesis (15) and the negation of the conclusion (16), which is

"For every $\alpha > 0$ there is $t = t(\alpha) \in T \setminus T^*$ such that $f(x^* + \alpha d, t(\alpha)) > 0$,"

are not simultaneously satisfied. If this were true one would have the following situation: For every $\alpha_n$ of the sequence $\alpha_n = 2^{-n}$ there is a $t_n = t_n(\alpha_n) \in T \setminus T^*$ such that

(17)  $f(x^* + \alpha_n d, t_n(\alpha_n)) > 0$,  $n = 0, 1, 2, \dots$

Since $T$ is compact, $\{t_n\}$ has an accumulation point $\bar t \in T$, i.e. there is a convergent subsequence $\{t_{n_i}\}$ with $\bar t$ as its limit point. We discuss separately two possibilities and arrive at contradictions in each case.

CASE I: $\bar t \in T^*$. Since $f(x^*,\bar t) = 0$ and $d'\nabla f(x^*,\bar t) < 0$, by (15), there exists $\bar\alpha > 0$ such that

(18)  $f(x^* + \bar\alpha d, \bar t) < 0$.

For all large values of index $i$, $\alpha_{n_i} < \bar\alpha$ and

(19)  $f(x^*, t_{n_i}) < 0$,

since $t_{n_i} \in T \setminus T^*$. This implies

(20)  $f(x^* + \bar\alpha d, t_{n_i}) > 0$.

(If (20) were not true, one would have, for some particular $n_i$,

(21)  $f(x^* + \bar\alpha d, t_{n_i}) \le 0$.

Now $\alpha_{n_i} < \bar\alpha$, (19), (21) and the convexity of $f$ imply

$f(x^* + \alpha_{n_i} d, t_{n_i}) \le 0$

which contradicts (17).) But (18) and (20) contradict the continuity of $f(x^* + \bar\alpha d, \cdot)$.

CASE II: $\bar t \in T \setminus T^*$. Since $f(x^*,\bar t) < 0$, there exists $\bar\alpha > 0$ such that (18) holds, by the continuity of $f(\cdot,\bar t)$. The rest of the proof is the same as in Case I.

□


A characterization of optimality for strictly convex constraints follows.

COROLLARY 5: Let $x^*$ be a feasible solution of problem (P), where $f^k(\cdot,t)$ are strictly convex for every $t \in T_k$, $k \in P^*$. Then $x^*$ is an optimal solution of (P) if, and only if, for every $\alpha^* > 0$ the system

(A)  $d'\nabla f^0(x^*) < 0$

(B_3)  $d'\nabla f^k(x^*,t) < 0$  for all $t \in T_k^*$,

$k \in P^*$

is inconsistent.

PROOF: First we recall that $f^k$, $k \in P^*$, under the assumption of the corollary, have the uniform mean value property. If $x^*$ is not optimal, then the system (A), (B_3), (C) is consistent, by Corollary 4. This implies that the less restrictive system (A), (B_3) is consistent. Suppose that the system (A), (B_3) is consistent. Then for every $k \in P^*$ there is $\alpha_k > 0$ such that

$f^k(x^* + \alpha_k d, t) \le 0$  for all $t \in T_k \setminus T_k^*$

by Lemma 1. Let

$\alpha^* \triangleq \min\{\alpha_k : k \in P^*\}$.

By the convexity of $f^k$, it follows that

$f^k(x^* + \alpha^* d, t) \le 0$  for all $t \in T_k \setminus T_k^*$,  $k \in P^*$.

This is equivalent to (C) of Theorem 1 (see (2-b)). Therefore the system (A), (B_3), (C) is consistent. This implies that the system (A), (B), (C) is consistent. (The reader may verify this statement by the technique used in the proof of Lemma 2.) Hence $x^*$ is not optimal, by Corollary 4.

□


REMARK 2: Differentiable strictly convex (in all variables!) functions $f^k$ do have the uniform mean value property. However, this is not necessarily true in the case of convex functions with strictly convex restrictions. In particular, function

$f(x_1,x_2,t) = \begin{cases} x_1^2 + t x_2 (x_2 - t) & \text{if } x_2 \le \tfrac{1}{2} t \\ x_1^2 + \frac{t^3}{(2-t)^2}(x_2 - t + 1)(x_2 - 1) & \text{if } x_2 \ge \tfrac{1}{2} t \end{cases}$

is differentiable and has strictly convex restrictions for every $t \in [0,1]$. Note that

$[k](t) = \begin{cases} \{1\} & \text{if } t = 0 \\ \{1,2\} & \text{if } t \in (0,1]. \end{cases}$

But function $f$ does not have the uniform mean value property. One can show, however, that a differentiable function which is strictly convex in its actual variables and such that $[k](t)$ is constant over the whole compact set $T$, does have the mean value property.

4.  PROGRAMS  WITH  UNIFORMLY  DECREASING  CONSTRAINTS 

The applicability of Theorem 1 is, in general, obscured by the appearance of parameter $\alpha^*$ in conditions (B) and (C). The purpose of this section is to point out some of the topological difficulties which arise in the removing of $\alpha^*$ from condition (B). A class of convex functions for which the optimality conditions can be stated without reference to $\alpha^*$ in condition (B) will be called the uniformly decreasing functions.

In what follows we assume that $f: R^n \times T \to R$ is convex and differentiable in $x \in R^n$ for every $t$ of a compact set $T$ in $R^1$. Further, $\nabla f(x^*,t)$ denotes $\nabla_x f(x^*,t)$.

DEFINITION 3: Let $f: R^n \times T \to R$ and $x^* \in R^n$ be such that $T^* \ne \emptyset$. Then for a given $d \in R^n$, $d \ne 0$, the function $f$ is uniformly decreasing at $x^*$ in the direction $d$, if (i) the set

$S(x^*,d) \triangleq \{t \in T^* : d'\nabla f(x^*,t) < 0\}$

is compact and if (ii) there exists $\bar\alpha > 0$ such that $f(x^* + \bar\alpha d, t) = 0$ for all $t \in T^*$ for which $d \in D(x^*,t)$.

It is not easy to recognize whether a general convex function $f$ is uniformly decreasing.

EXAMPLE 4: Consider the following functions from $R \times R$ into $R$:
$f^1(x,t) = t[(x-t)^2 - t^2]$,  $t \in T$  (used in Example 1)
$f^2(x,t) = x^2 - tx$,  $t \in T$
$f^3(x,t) = -tx$,  $t \in T$.

These functions are all convex, $f^2$ is actually strictly convex and $f^3$ linear in $x$ for every $t \in T$. If $T = [0,1]$, then none of these functions is uniformly decreasing at $x^* = 0$ in the direction $d = 1$. However, if $T = [1,2]$ then all three functions are uniformly decreasing at $x^* = 0$ in the same direction $d = 1$.

As suggested by the above example, a convex function $f$ is uniformly decreasing at $x^*$ in the direction $d \ne 0$, whenever $\nabla f(x^*,\cdot)$ is continuous and the set

$\{t \in T^* : d'\nabla f(x^*,t) = 0\}$

is empty. Its complement

$S(x^*,d) = T^* \setminus \{t \in T^* : d'\nabla f(x^*,t) = 0\} = T^*$

is then compact. In particular, all analytic functions not identically zero are uniformly decreasing. However, a characterization of optimality for problem (P) with such constraint functions is already given by Corollary 4.

An important uniformity property of convex functions with compact $S(x^*,d)$ follows.

LEMMA 2: Let $f(x,t)$ be convex and differentiable in $x$, for every $t$ in a compact set $T \subset R^1$ and continuous in $t$, for every $x \in R^n$. Suppose further that for some $x^*$ and $d \ne 0$ in $R^n$, the set $S(x^*,d)$ is nonempty and compact. Then there exists $\bar\alpha > 0$ such that

(22)  $f(x^* + \alpha d, t) < 0$,  $0 < \alpha \le \bar\alpha$

for all $t \in S(x^*,d)$.

PROOF: Suppose that such $\bar\alpha > 0$ does not exist. Then there exists a sequence $\{t_i\} \subset S(x^*,d)$ and a sequence $\{\alpha_i\}$, $\alpha_i = \alpha_i(t_i) > 0$ such that

$f(x^* + \alpha d, t_i) < 0$,  $0 < \alpha < \alpha_i$

and

(23)  $f(x^* + \alpha_i d, t_i) \ge 0$

with $\inf\{\alpha_i\} = 0$. Since $S(x^*,d)$ is compact, $\{t_i\}$ contains a convergent subsequence $\{t_{i_j}\}$. Let $\bar t \in S(x^*,d)$ be the limit point of $\{t_{i_j}\}$. Now

$d'\nabla f(x^*,\bar t) < 0$

implies that there exists $\hat\alpha > 0$ such that

$f(x^* + \alpha d, \bar t) < 0$,  $0 < \alpha \le \hat\alpha$.

In particular,

(24)  $f(x^* + \hat\alpha d, \bar t) < 0$.

For any $\epsilon > 0$ there exists $j_0 = j_0(\epsilon)$ such that

(25)  $|t_{i_j} - \bar t| < \epsilon$  and  $\alpha_{i_j} < \hat\alpha$  for all $j > j_0$.

Now (23) and (25) imply

(26)  $f(x^* + \hat\alpha d, t_{i_j}) \ge 0$  for all $j > j_0$.

But the inequalities (24) and (26) contradict the continuity of $f(x^* + \hat\alpha d, \cdot)$.


EXAMPLE 5: Consider again

$f^2(x,t) = x^2 - tx$,  $t \in T = [1,2]$.

This function is uniformly decreasing at $x^* = 0$ in the direction $d = 1$. The inequality (22) holds for every $0 < \bar\alpha < 1$, in particular $\bar\alpha = \tfrac{1}{2}$. If the above interval $T$ is replaced by $T = [0,1]$, then $f^2$ is not uniformly decreasing at $x^* = 0$ with $d = 1$. An $\bar\alpha > 0$ satisfying (22) here does not exist.
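The two situations of Example 5 can be checked numerically with the sketch below, which evaluates $f^2(x,t) = x^2 - tx$ along the direction $d = 1$ from $x^* = 0$. The helper `satisfies_22`, the grids and the step counts are illustrative assumptions; a grid check can suggest that (22) holds but does not prove it.

```python
def f2(x, t):
    """f^2 from Examples 4 and 5: x^2 - t * x."""
    return x * x - t * x


def satisfies_22(alpha_bar, ts, x_star=0.0, d=1.0, steps=50):
    """Grid check of (22): f2(x* + alpha*d, t) < 0 for 0 < alpha <= alpha_bar."""
    for t in ts:
        for i in range(1, steps + 1):
            alpha = alpha_bar * i / steps
            if f2(x_star + alpha * d, t) >= 0:
                return False
    return True


T_12 = [1 + i / 100 for i in range(101)]  # grid on T = [1, 2]

print(satisfies_22(0.5, T_12))   # a bound below 1
print(satisfies_22(1.0, T_12))   # the bound 1 itself fails at t = 1

# For T = [0, 1] every candidate bound a is defeated by the point t = a / 2,
# which lies in S(x*, d) = (0, 1] whenever 0 < a <= 2.
witnesses = [f2(a, a / 2) for a in (1e-6, 1e-3, 0.5, 1.0)]
print(all(w > 0 for w in witnesses))
```

The witness $t = \bar\alpha/2$ makes explicit why no uniform bound exists on $T = [0,1]$: the set $S(x^*,d) = (0,1]$ is not closed, so points of $S$ come arbitrarily close to $t = 0$.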

A characterization of optimality for programs (P), with constraint functions which have the uniform mean value property and are uniformly decreasing, follows.

THEOREM 3: Let $x^*$ be a feasible solution of problem (P), where $f^k$, $k \in P^*$ have the uniform mean value property. Suppose also that $f^k$, $k \in P^*$ are uniformly decreasing at $x^*$ in every feasible direction $d$. Then $x^*$ is an optimal solution of (P) if, and only if, for every $\alpha^* > 0$ the system

(A)  $d'\nabla f^0(x^*) < 0$,

(B_4)  $d'\nabla f^k(x^*,t) < 0$  or  $d \in D_k(x^*,t)$  for all $t \in T_k^*$,

(C)  $\frac{d'\nabla f^k(x^* + \alpha^* d, t)}{f^k(x^*,t)} \ge -\frac{1}{\alpha^*}$  for all $t \in T_k \setminus T_k^*$,

$k \in P^*$

is inconsistent.


PROOF: Parts (A) and (C) are proved as in the case of Theorem 1. It is left to show that the existence of $\bar\alpha > 0$ satisfying (2-a) is equivalent to the consistency of (B_4). It is clear that (2-a) implies (B_4). In order to show that (B_4) implies (2-a) we use the assumption that the functions $\{f^k(x,t) : k \in P^*\}$ are uniformly decreasing at $x^*$ in the direction $d$. When (B_4) holds, then for every $k \in P^*$ there exist $\alpha_k > 0$ and $\alpha_k' > 0$ such that

(27)  $f^k(x^* + \alpha d, t) < 0$,  $0 < \alpha \le \alpha_k$,  for all $t \in S_k \triangleq \{t \in T_k^* : d'\nabla f^k(x^*,t) < 0\}$,

by Lemma 2, and

(28)  $f^k(x^* + \alpha d, t) = 0$,  $0 < \alpha \le \alpha_k'$,  for all $t \in T_k^* \setminus S_k$,

since $d \ne 0$. The latter follows by part (ii) of Definition 3 and the convexity of $f^k$. Let

(29)  $\bar\alpha \triangleq \min_{k \in P^*} \{\alpha_k, \alpha_k'\} > 0$.

Clearly (27) and (28) can be written as the single statement (2-a) with $\bar\alpha$ chosen as in (29).

□


The following example shows that the assumption that $\{f^k(x,t) : k \in P^*\}$ be uniformly decreasing at $x^*$ cannot be omitted in Theorem 3.

EXAMPLE 6: Consider the program
Min $f^0(x) = -x$
s.t.
$f(x,t) \le 0$,  for all $t \in T = [0,1]$

where

$f(x,t) = \begin{cases} t(x-t)^2 & \text{if } x > t \\ 0 & \text{if } x \le t. \end{cases}$

The feasible set consists of the single point $x^* = 0$, which is therefore optimal. One can verify, after some manipulation, that the constraint function $f$ has the uniform mean value property at $x^*$. (For every $\bar\alpha > 0$ there exists $0 < \tilde\alpha \le \tfrac{1}{2}\bar\alpha$ such that (MV) holds.) However, $f$ is not uniformly decreasing at $x^*$. In order to demonstrate that Theorem 3 here fails, first we note that $T^* = T = [0,1]$, so the condition (C) is redundant. Since $d = 1$ is in the cone of directions of constancy $D(x^*,t)$ for every $t \in (0,1]$, we conclude that the system (A), (B_4) is here consistent, contrary to the statement of the theorem. Therefore the assumption that the constraint functions be uniformly decreasing cannot be omitted in Theorem 3.
5.  THE  FRITZ  JOHN  AND  KUHN-TUCKER  THEORIES  FOR
SEMI-INFINITE  PROGRAMMING

In contrast to the characterizations of optimality stated in the preceding sections we will now recall the Fritz John and Kuhn-Tucker theories for semi-infinite programming. In the sequel we use the following concept from the duality theory of semi-infinite programming, e.g. [3].


DEFINITION 4: Let $I$ be an arbitrary index set, $\{p^i : i \in I\}$ a collection of vectors in $R^m$ and $\{c_i : i \in I\}$ a collection of scalars. The linear inequality system

$u'p^i \le c_i$,  for all $i \in I$

is canonically closed if the set of coefficients $\{((p^i)', c_i) : i \in I\}$ is compact in $R^{m+1}$ and there exists a point $u^0$ such that

$(u^0)'p^i < c_i$,  for all $i \in I$.

We will say that problem (P) is canonically closed at $x^*$ if the system

(B_5)  $d'\nabla f^k(x^*,t) \le 0$  for all $t \in T_k^*$, $k \in P^*$

is canonically closed.

REMARK  3:  All  constraint  functions  of  problem  (P)  can  have  the  uniform  mean  value 
property,  or  they  can  be  uniformly  decreasing,  without  problem  (P)  being  canonically  closed. 

Lemma 3 below is a specialized version of Theorem 3 from [3], adjusted to our need. It is related to the following pair of the semi-infinite linear programs:

(I)   Inf $u'p^0$
      s.t.
      $u'p^i \ge c_i$,  all $i \in I$
      $u \in R^m$

(II)  Sup $\sum_{i \in I} c_i \lambda_i$
      s.t.
      $\sum_{i \in I} \lambda_i p^i = p^0$
      $\lambda \in S$,  $\lambda \ge 0$,

where $S$ is the vector space of all vectors $\{\lambda_i : i \in I\}$ with only finitely many nonzero entries. Denote by $V_I$ and $V_{II}$ the optimal values of (I) and (II), respectively.

LEMMA 3: Assume that the linear inequality system of problem (I) is canonically closed. If the feasible set of problem (I) is nonempty and $V_I$ is finite, then problem (II) is consistent and $V_{II} = V_I$. Moreover, $V_{II}$ is a maximum.

The concept of a canonically closed system is used in the proof of the dual statement of the following theorem.

THEOREM 4: ("The Fritz John Necessity Theorem") Let $x^*$ be an optimal solution of problem (P). Then the system

(A)  $d'\nabla f^0(x^*) < 0$

(B_3)  $d'\nabla f^k(x^*,t) < 0$  for all $t \in T_k^*$,

$k \in P^*$

is inconsistent or, dually, the system

(FJ)  $\lambda_0 \nabla f^0(x^*) + \sum_{k \in P^*} \sum_{t \in T_k^*} \lambda_t^k \nabla f^k(x^*,t) = 0$
      $\lambda_0$, $\{\lambda_t^k : t \in T_k^*,\ k \in P^*\}$ nonnegative scalars, not all zero and of which only finitely many are positive

is consistent.

PROOF: If $x^*$ is optimal, then the inconsistency of the system (A), (B_3) is well-known, e.g. [4, Lemma 1]. In order to prove the dual statement, we note that the inconsistency of (A), (B_3) is equivalent to $\mu^* = 0$ being the optimal value of the semi-infinite linear program

$(\bar{\mathrm{I}})$  Min $\mu$
      s.t.
      $d'\nabla f^0(x^*) + \mu \ge 0$
      $d'\nabla f^k(x^*,t) + \mu \ge 0$,  all $t \in T_k^*$, $k \in P^*$.

The dual of $(\bar{\mathrm{I}})$ is

$(\overline{\mathrm{II}})$  Max 0
      s.t.
      $\lambda_0 \nabla f^0(x^*) + \sum_{k \in P^*} \sum_{t \in T_k^*} \lambda_t^k \nabla f^k(x^*,t) = 0$
      $\lambda_0 + \sum_{k \in P^*} \sum_{t \in T_k^*} \lambda_t^k = 1$,
      $\lambda_0, \lambda_t^k \ge 0$, only finitely many are positive.

The feasible set of problem $(\bar{\mathrm{I}})$ is clearly nonempty and canonically closed ($d = 0$, $\mu = 1$ satisfy the constraints of $(\bar{\mathrm{I}})$ with strict inequalities). Lemma 3 is now readily applicable to the pair $(\bar{\mathrm{I}})$, $(\overline{\mathrm{II}})$, which proves (FJ).

The dual statement in Theorem 4 is the Fritz John optimality condition for semi-infinite programming. For an equivalent formulation the reader is referred to Gehner's paper [4].

Under various "constraint qualifications" such as Slater's condition:

$\exists\, x \in R^n \ni f^k(x,t) < 0$  for all $t \in T_k$, $k \in P$

or the "Constraint Qualification U" of Gehner [4], one can set $\lambda_0 = 1$ in Theorem 4. In fact, the same is possible if problem (P) is canonically closed at $x^*$, i.e. if there exists $d$ such that

(30)  $d'\nabla f^k(x^*,t) < 0$  for all $t \in T_k^*$, $k \in P^*$.

This is easily seen by multiplying the equation in (FJ) by $d$ satisfying (30). Note that the canonical closedness assumption is a semi-infinite version of the Arrow-Hurwicz-Uzawa constraint qualification, e.g. [12]. The latter constraint qualification is implied by Slater's condition.

The Fritz John condition (FJ) with $\lambda_0 = 1$ is a semi-infinite version of the Kuhn-Tucker condition, e.g. [12]. While the Fritz John condition is necessary but not sufficient, the Kuhn-Tucker condition is sufficient but not necessary for optimality. If a constraint qualification is assumed, then the Kuhn-Tucker condition is both necessary and sufficient for optimality for problem (P). If a constraint qualification is not satisfied then the Fritz John condition fails to establish the optimality and the Kuhn-Tucker condition fails to establish the nonoptimality of a feasible point $x^*$. In contrast, our results are applicable. This will be demonstrated by two examples. (See also an example, taken from approximation theory, in Section 6.)

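EXAMPLE 7: Consider the semi-infinite convex problem

Min $f^0 = x_1 - x_2$
s.t.
$f^1 = x_1^2 + t x_2 - t^2 \le 0$ for all $t \in T_1 = [0,1]$
$f^2 = -x_1 - t x_2 - t \le 0$ for all $t \in T_2 = [0,1]$.

The feasible set is

$F = \{(0, x_2)' : -1 \le x_2 \le 0\}$

and the optimal solution is x* = (0,0)'. For this point
$T_1^* = T_2^* = \{0\}$, $P^* = \{1,2\}$.

The system (B5) is

$0 \le 0$
$-d_1 \le 0$,

obviously not canonically closed. The Kuhn-Tucker condition is

$\begin{pmatrix} 1 \\ -1 \end{pmatrix} + \lambda_1 \begin{pmatrix} 0 \\ 0 \end{pmatrix} + \lambda_2 \begin{pmatrix} -1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$

$\lambda_1 \ge 0$, $\lambda_2 \ge 0$
which clearly fails.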

One can easily verify that the constraint functions $f^1$ and $f^2$ have the uniform mean value
property. Also, these functions are uniformly decreasing at x* = 0 in every direction $d \ne 0$.
(The sets $T_1^*$ and $T_2^*$ are singletons!) Thus Theorem 3 is applicable. Conditions (A), (B4) and
(C) are here

(A)   $d_1 - d_2 < 0$

(B4)  $0 < 0$ or $d_1 = 0$, $d_2 \in R$
      $-d_1 \le 0$

(C)   $\dfrac{\alpha^* d_1^2 + t d_2}{-t^2} \ge -\dfrac{1}{\alpha^*}$ for all $t \in (0,1]$
      $\dfrac{-d_1 - t d_2}{-t} \ge -\dfrac{1}{\alpha^*}$ for all $t \in (0,1]$.

This reduces to

      $d_1 = 0$, $d_2 > 0$

(31)  $\dfrac{d_2}{-t} \ge -\dfrac{1}{\alpha^*}$ for all $t \in (0,1]$.

Since $d_2 > 0$, the inequality (31) cannot hold for any $\alpha^* > 0$. Hence, by Theorem 3,
x* = (0,0)' is optimal. The optimality of a feasible point is thus established here using
Theorem 3 and not by the Kuhn-Tucker condition which here fails.


Consider now the point x* = (0,-1)'. Here
$T_1^* = \{0\}$, $T_2^* = [0,1]$, $P^* = \{1,2\}$.

It is easy to verify that the Fritz John condition is satisfied in spite of the fact that x* is not
optimal. Conditions (A), (B) and (C) are here

(A)  $d_1 - d_2 < 0$

(B)  $0 \le 0$
     $-d_1 - t d_2 \le 0$ for all $t \in [0,1]$

(C)  $\dfrac{\alpha^* d_1^2 + t d_2}{-t - t^2} \ge -\dfrac{1}{\alpha^*}$ for all $t \in (0,1]$.

For $\alpha^* = 1$, these conditions are satisfied by $d_1 = 0$, $d_2 = 1$. Hence, by Theorem 1, the point
x* = (0,-1)' is not optimal. Both the Fritz John and the Kuhn-Tucker theories fail to
characterize optimality in this example because a constraint qualification (or a regularization condition,
e.g. [ID) is not here satisfied.


Although  the  Fritz  John  and  Kuhn-Tucker  theories  fail  to  characterize  optimality,  they 
can  be  used  to  formulate,  respectively,  either  the  necessary  or  the  sufficient  conditions  of 
optimality. 


In the remainder of the section we will show that the ordinary Kuhn-Tucker condition
(i.e. the (FJ) condition with $\lambda_0 = 1$) can be weakened by assuming an asymptotic form. For a
related discussion in Banach spaces the reader is referred to [16].

THEOREM 5: ("The Kuhn-Tucker Sufficiency Theorem") Let x* be a feasible solution of
problem (P). Then x* is optimal if the system

(A)   $d'\nabla f^0(x^*) < 0$

(B5)  $d'\nabla f^k(x^*, t) \le 0$ for all $t \in T_k^*$, $k \in P^*$

is inconsistent or, dually, if the system

($\overline{\text{K-T}}$)  $-\nabla f^0(x^*) \in \mathrm{cl}\Big\{ \sum_{k \in P^*} \sum_{t \in T_k^*} \lambda_t^k \nabla f^k(x^*, t) \Big\}$

      $\{\lambda_t^k : t \in T_k^*,\ k \in P^*\}$ nonnegative scalars
      of which only finitely many are positive

is consistent.

PROOF: If the system (A), (B5) is inconsistent, so is (A), (B). (Recall that
$D_k(x^*, t) \subset \{d : d'\nabla f^k(x^*, t) = 0\}$.) Hence, in particular, the system (A), (B), (C) is
inconsistent. Following the proof of Theorem 1, one concludes that x* is optimal. The inconsistency
of (A), (B5) is equivalent to the consistency of ($\overline{\text{K-T}}$), by e.g. [11, Corollary 5].

□


REMARK 4: The "asymptotic" form of the Kuhn-Tucker conditions ($\overline{\text{K-T}}$) gives a
weaker sufficient condition for optimality than the familiar (i.e., without the closure) condition

(K-T)  $\nabla f^0(x^*) + \sum_{k \in P^*} \sum_{t \in T_k^*} \lambda_t^k \nabla f^k(x^*, t) = 0$

       $\{\lambda_t^k : t \in T_k^*,\ k \in P^*\}$ nonnegative scalars
       of which only finitely many are positive.

In some situations the primal Kuhn-Tucker conditions (A), (B5) may be easier to apply
than (K-T). This will be illustrated on the following problem taken from [8, Example 2.4].

EXAMPLE 8: Consider

Min $f^0 = 4x_1 + \frac{2}{3}(x_4 + x_6)$

s.t.

$f^1 = -x_1 - t_1 x_2 - t_2 x_3 - t_1^2 x_4 - t_1 t_2 x_5 - t_2^2 x_6 + 3 - (t_1 - t_2)^2 (t_1 + t_2)^2 \le 0$

for all $t \in T_1 = \{(t_1, t_2)' : -1 \le t_i \le 1,\ i = 1,2\}$.

We will show, using the Kuhn-Tucker theory, that x* = (3,0,0,0,0,0)' is an optimal solution.
The optimality of x* has been established in [8] by a different approach.

First note that here

$T_1^* = \{t \in T_1 : t_1 - t_2 = 0 \text{ or } t_1 + t_2 = 0\}$.

The system (A), (B5) becomes

(A)   $4d_1 + \frac{2}{3} d_4 + \frac{2}{3} d_6 < 0$

(B5)  $-d_1 - t_1 d_2 - t_2 d_3 - t_1^2 d_4 - t_1 t_2 d_5 - t_2^2 d_6 \le 0$ for all $t \in T_1^*$.

Substitute in (B5) the following five points of $T_1^*$:

(0,0)', (1,1)', (1,-1)', (-1,1)', (-1,-1)'.

This gives

$-d_1 \le 0$
$-d_1 - d_2 - d_3 - d_4 - d_5 - d_6 \le 0$
$-d_1 - d_2 + d_3 - d_4 + d_5 - d_6 \le 0$
$-d_1 + d_2 - d_3 - d_4 + d_5 - d_6 \le 0$
$-d_1 + d_2 + d_3 - d_4 - d_5 - d_6 \le 0$.

Multiply the first inequality by ten thirds and each of the remaining four inequalities by one
sixth, then add all five inequalities. We get

$-4d_1 - \frac{2}{3} d_4 - \frac{2}{3} d_6 \le 0$

which contradicts (A). Thus the system (A), (B5) is inconsistent and x* = (3,0,0,0,0,0)' is
optimal, by Theorem 5.
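
The weighted sum used in Example 8 can be checked mechanically. The following is a minimal sketch, not part of the original paper: the helper name `gradient` and the use of exact fractions are illustrative assumptions. It builds the coefficient rows of (B5) at the five chosen points of $T_1^*$ and forms the combination with weights 10/3 and 1/6.

```python
from fractions import Fraction

def gradient(t1, t2):
    # Coefficients of d_1..d_6 in (B5): the gradient of f^1 with respect to x
    # evaluated at the index point (t1, t2). Hypothetical helper for illustration.
    return [-1, -t1, -t2, -t1 * t1, -t1 * t2, -t2 * t2]

# The five active index points substituted into (B5).
points = [(0, 0), (1, 1), (1, -1), (-1, 1), (-1, -1)]

# Every point must satisfy t1 - t2 = 0 or t1 + t2 = 0 to lie in T_1^*.
assert all((t1 - t2) * (t1 + t2) == 0 for t1, t2 in points)

# Weight 10/3 on the first inequality, 1/6 on each of the other four.
weights = [Fraction(10, 3)] + [Fraction(1, 6)] * 4

rows = [gradient(*p) for p in points]
combo = [sum(w * g for w, g in zip(weights, column)) for column in zip(*rows)]
print(combo)
```

The resulting coefficient vector is (-4, 0, 0, -2/3, 0, -2/3), i.e. the left-hand side of (A) with its sign reversed, which is exactly why the combined inequality contradicts (A).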

Theorems 1 and 3 suggest that the presently used constraint qualifications for semi-infinite
programming problems are too restrictive because they do not employ the topological
properties of problem (P), such as the uniform mean value property or the uniformly decreasing
constraints.

6.  AN  APPLICATION  TO  CHEBYSHEV  APPROXIMATION 

It is well-known that there is a close connection between convex programming and
approximation theory, e.g. [5], [13]. In fact, many approximation problems can be formulated
as convex semi-infinite programming problems in which case the results of this paper are
readily applicable. In particular, the problem of linear Chebyshev approximation subject to side
constraints

(MM)  $\min_x \max_{t \in T} \Big| f(t) - \sum_{i=1}^{n} x_i g_i(t) \Big|$

      s.t.

      $l(t) \le \sum_{i=1}^{n} x_i g_i(t) \le u(t)$ for all $t \in T$

is equivalent to the linear semi-infinite programming problem

(L)   Min $x_{n+1}$

      s.t.

      $-x_{n+1} \le \sum_{i=1}^{n} x_i g_i(t) - f(t) \le x_{n+1}$

      $l(t) \le \sum_{i=1}^{n} x_i g_i(t) \le u(t)$

      for all $t \in T$.



A. BEN-TAL, L. KERZNER, AND S. ZLOBEC


Corollary  3  of  this  paper  can  be  applied  to  (L)  and  it  gives  a  characterization  of  the  best  approx¬ 
imation  for  the  problem  (MM).  Uniqueness  of  the  best  approximation  can  be  checked  using 
Theorem  2.  Rather  than  going  into  details  we  will  illustrate  this  application  by  an  example. 

EXAMPLE 9: The approximation problem stated in this example is taken from [4], see
also [15]. It shows that there exist situations when the Kuhn-Tucker theory for semi-infinite
programming fails to establish optimum even in the case of linear constraints. However, the
optimality is established using the results of this paper.


The linear Chebyshev approximation problem is

$\min_x \max_{t \in [0,1]} | t^4 - x_1 - x_2 t |$

s.t.

$-t \le x_1 + x_2 t \le t^2$, for all $t \in [0,1]$.

An equivalent linear semi-infinite programming problem is

Min $f^0 = x_3$

s.t.

$f^1 = t^4 - x_1 - x_2 t - x_3 \le 0$
$f^2 = -t^4 + x_1 + x_2 t - x_3 \le 0$
$f^3 = -t^2 + x_1 + x_2 t \le 0$
$f^4 = -t - x_1 - x_2 t \le 0$

for all $t \in [0,1]$.

Is x* = (0,0,1)' optimal?

Here $T_1^* = \{1\}$, $T_2^* = \emptyset$, $T_3^* = \{0\}$, $T_4^* = \{0\}$ and $P^* = \{1,3,4\}$. The system (A), (B5) is

(A)   $d_3 < 0$

(B5)  $-d_1 - d_2 - d_3 \le 0$
      $d_1 \le 0$
      $-d_1 \le 0$

and it is clearly consistent (set e.g. $d_1 = 0$, $d_2 = 1$, $d_3 = -1$). Therefore, Theorem 5 cannot be
applied. (Since the system ($\overline{\text{K-T}}$) is inconsistent, x* = (0,0,1)' is not a "Kuhn-Tucker
point".) But the system

(A)   $d_3 < 0$

(B2)  $-d_1 - d_2 - d_3 \le 0$
      $d_1 \le 0$
      $-d_1 \le 0$

(C1)  $\dfrac{-d_1 - d_2 t - d_3}{t^4 - 1} \ge -1$ for all $t \in [0,1)$
      $\dfrac{d_1 + d_2 t}{-t^2} \ge -1$ for all $t \in (0,1]$
      $\dfrac{d_1 + d_2 t}{t} \ge -1$ for all $t \in (0,1]$

is inconsistent. (First, $d_1 = 0$, by the last two inequalities in (B2). Now (A) and (B2) imply
$d_2 > 0$. This contradicts $d_2 \le 0$ obtained from the second inequality in (C1).) Therefore
x* = (0,0,1)' is optimal, by Corollary 1.
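
The two claims in Example 9 about the direction d = (0, 1, -1)' can be spot-checked numerically. The sketch below is illustrative only: the function names and the sample grid of t values are assumptions, and a grid check is a sanity test rather than a proof. It confirms that d satisfies (A), (B5), and that the second inequality of (C1) fails for every sampled t below 1, which is the mechanism that forces $d_2 \le 0$.

```python
def satisfies_A_B5(d):
    # (A): d3 < 0; (B5): -d1 - d2 - d3 <= 0, d1 <= 0, -d1 <= 0.
    d1, d2, d3 = d
    return d3 < 0 and -d1 - d2 - d3 <= 0 and d1 <= 0 and -d1 <= 0

def second_C1_holds(d, t):
    # Second inequality of (C1): (d1 + d2*t) / (-t^2) >= -1, for t in (0, 1].
    d1, d2, _ = d
    return (d1 + d2 * t) / (-t * t) >= -1

d = (0.0, 1.0, -1.0)                      # the direction suggested in the text
ts = [k / 10 for k in range(1, 11)]       # sample points 0.1, 0.2, ..., 1.0

consistent = satisfies_A_B5(d)
violations = [t for t in ts if not second_C1_holds(d, t)]
print(consistent, violations)
```

With $d_1 = 0$ the second inequality of (C1) reads $d_2 \le t$ for all t in (0,1], so any $d_2 > 0$ is excluded by small enough t; the sampled violations illustrate this for $d_2 = 1$.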
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SOLVING  INCREMENTAL  QUANTITY  DISCOUNTED 
TRANSPORTATION  PROBLEMS  BY  VERTEX  RANKING 
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ABSTRACT 

Logistics managers often encounter incremental quantity discounts when
choosing the best transportation mode to use. This could occur when there is a
choice of road, rail, or water modes to move freight from a set of supply points
to various destinations. The selection of mode depends upon the amount to be
moved and the costs, both continuous and fixed, associated with each mode.
This can be modeled as a transportation problem with a piecewise-linear
objective function. In this paper, we present a vertex ranking algorithm to solve the
incremental quantity discounted transportation problem. Computational results
for various test problems are presented and discussed.


1.  INTRODUCTION 

Whenever  a  logistics  manager  is  making  a  decision  about  the  movement  of  freight,  he  is 
often  faced  with  choosing  from  among  different  modes  of  transportation.  Movement  of  freight 
by  air  or  motor  express  may  involve  no  fixed  costs  to  the  transporter,  but  will  usually  involve 
relatively  higher  variable  costs  than  either  rail  or  water.  However,  both  rail  and  water  can 
involve  the  investment  of  large  sums  for  rail  sidings  or  docking  facilities.  The  problem  of 
selecting  freight  modes  can  be  modeled  as  a  transportation  problem  with  a  piecewise-linear 
objective function. This problem has been termed the incremental quantity discounted
transportation problem, since it is assumed that the variable costs decrease as the amount shipped
increases.  This  comes  about  due  to  the  lower  variable  costs  for  rail  or  water  modes  relative  to 
air  or  road  freight  costs.  The  presence  of  fixed  costs  for  the  use  of  rail  or  water  determines  the 
range  of  shipment  levels  over  which  each  cost  will  be  applicable.  Figure  1  shows  this  type  of 
objective  function. 

In this paper we will present a vertex ranking algorithm to solve this type of problem, along
with the computational results from various sizes and types of problems. Background material
is  discussed  in  Section  2,  while  the  details  of  the  algorithm  are  given  in  Section  3.  An  example 
is  worked  out  in  Section  4  while  Section  5  gives  computational  results. 

2.  BACKGROUND  MATERIAL 

The incremental quantity discounted transportation problem is a member of a general
class of math programming problems, i.e., the piecewise-linear programming problem. Vogt
and Evans [15] considered the case of the piecewise-linear transportation problem derived from
U.S. freight rates. This problem is neither convex nor concave, and has sections of the
objective function which are flat or "free." Figure 2 shows this case. Vogt and Evans used separable
nonconvex programming to reach an approximately optimal solution to this problem.
Balachandran and Perry [1] consider another version of this problem which they termed the all unit
quantity discount transportation problem. The main difference between this and the previous
case is the lack of the flat section of the objective function. The latter case is typical of some
foreign freight rates, and is shown in Figure 3 below.

Figure 1

Problems similar to this one have been mentioned in the plant location literature, e.g.,
Townsend [14], and Efroymson and Ray [5]. In these cases, it is suggested that the problem be
solved by considering multiple plants, one for each range of demand.

Balachandran and Perry presented a branch and bound algorithm for the all unit quantity
discount problem which, they show, will also work for the incremental quantity discount
problem as well as fixed charge transportation problems. However, no computational results are
given to demonstrate the efficiency of this algorithm. Here, we will consider a vertex ranking
algorithm for only the incremental quantity discount problem for two reasons. First, fixed
charge transportation problems have been handled in several other places in a manner that has
been shown to be superior to vertex ranking [2,8]. Secondly, the incremental quantity discount
transportation problem has a concave objective function, whereas both the problem considered
by Vogt and Evans and the all unit quantity discount transportation problem have nonconcave
objective functions. This is crucial to the use of vertex ranking since this procedure will only
consider vertices of the constraint set, and the optimal solution to problems with nonconcave
objective functions need not occur at a vertex.

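The incremental quantity discount problem may be formulated as follows (following the
model proposed by Balachandran and Perry [1]):

(1)  Min $Z = \sum_{i \in I} \sum_{j \in J} \Big( C_{ij} x_{ij} + \sum_{k \in R} f_{ij}^k y_{ij}^k \Big)$

(2)  subject to $\sum_{j \in J} x_{ij} = a_i$ for $i \in I$

(3)  $\sum_{i \in I} x_{ij} = b_j$ for $j \in J$

(4)  $C_{ij} = C_{ij}^k$ if $\lambda_{ij}^{k-1} \le x_{ij} < \lambda_{ij}^k$, $k = 1, \ldots, r$

(5)  $y_{ij}^k = 1$ if $\lambda_{ij}^{k-1} \le x_{ij} < \lambda_{ij}^k$, and $y_{ij}^k = 0$ otherwise

(6)  $f_{ij}^k = \sum_{l=1}^{k-1} C_{ij}^l (\lambda_{ij}^l - \lambda_{ij}^{l-1}) - C_{ij}^k \lambda_{ij}^{k-1}$ for $k = 2, 3, \ldots, r$

and

(7)  $f_{ij}^1 = 0$, $x_{ij} \ge 0$ for all $i \in I$ and $j \in J$,

where

J = {1, ..., n} = set of sinks,

I = {1, ..., m} = set of sources,

R = {1, ..., r} = set of cost intervals.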

As may be easily seen, this is a generalization of the fixed charge transportation problem
(see HP), with a fixed charge, $f_{ij}^k$, and a continuous cost, $C_{ij}^k$, for each range of shipment
between source i and destination j. Since the situation which we are attempting to model, i.e.,
the choice of shipment mode, does involve various levels of fixed charge, (1) - (7) is the
proper formulation for this problem. It should be noted that we are implicitly assuming that

$C_{ij}^k > C_{ij}^{k+1}$ for all $i \in I$, $j \in J$.

This is necessary for the concavity of the objective function. However, we would expect that
lower continuous costs would occur for higher shipment levels.

Balachandran and Perry [1] suggested that (1) - (7) may be solved by a branch and bound
algorithm. Their procedure is similar to that used to solve travelling salesman problems by
driving out subtours [13]. They solve the transportation problem with all costs set to their
lowest value, i.e., $C_{ij}^r$. If any routes have flow below $\lambda_{ij}^{r-1}$, branching is done on one of these
variables. Two branches are used. One branch forces the flow over the arc above the lower
limit for the cost level r, i.e., $x_{ij} \ge \lambda_{ij}^{r-1}$. In the other branch, the infeasible cost, $C_{ij}^r$, is
replaced by the feasible cost, $C_{ij}^k$. This continues until a solution is found where the arc flows
match the costs used. This is the optimal solution. However, the effectiveness of the
procedure is unknown since the authors did not provide any computational results.

It would also appear that the work of Kennington [8] on the fixed charge transportation
problem could possibly be modified to solve this problem by having multiple arcs between each
set of nodes. Each arc would be bounded by $\lambda_{ij}^{k-1}$ and $\lambda_{ij}^k$, with multiple continuous costs and
fixed costs. However, this would lead to effectively larger problems, e.g., a problem with 60
arcs and five breakpoints would have 300 variables in the new problem.

3. SOLUTION PROCEDURE

Using the formulation of the incremental quantity discount transportation problem given
in (1) - (7), along with the assumption of decreasing costs, we have a problem with linear
constraints and concave objective function. It is well known [7] that an optimal solution for
problems of this type will occur at a vertex of the constraint set. Examples of other problems that
share this condition are the fixed charge problem, the quadratic transportation problem, and the
quadratic assignment problem. Murty [12] was the first to suggest a vertex ranking scheme for
a problem of this category. He showed that the fixed charge problem could be solved by
ranking the vertices of the constraint set according to the objective value up to some upper bound. At
that point, the optimal solution would be found at one or more of the ranked vertices.




We may formulate any problem with concave objective function and linear constraints as
below.

(8)   Min $f(x)$

(9)   s.t. $x \in S$

(10)  where $S = \{x \mid Ax = b,\ x \ge 0\}$.

Since no "direct" optimization techniques exist for the case where $f(x)$ is nonlinear, we
shall look at a procedure for searching the vertices of S. To do this, we will use a linear
underapproximation of $f(x)$, say $L(x)$, such that $L(x) \le f(x)$, $x \in S$. In this case, to show
that x* is an optimal solution to (8) - (10), we need only rank the vertices of S until a vertex
x° is found such that $L(x°) \ge f(x^*)$. At this point, all vertices that could possibly be optimal
have been ranked. This is proved by Cabot and Francis [3].

In order to rank the extreme points of S, we need to use a result also first proved by
Murty as Theorem 1 below:

THEOREM 1: If $E_1, E_2, \ldots, E_K$ are the first K vertices of a linear underapproximation
problem which are ranked in nondecreasing order according to their objective value, then
vertex $E_{K+1}$ must be adjacent to one of $E_1, E_2, \ldots, E_K$.

Simply put, this says that vertex 2 will be adjacent to the optimal solution to the linear
underapproximation, vertex 3 will be adjacent to vertex 1 or vertex 2, and so on. This,
then, gives us a procedure for ranking the vertices if all adjacent vertices can be found. It is
this "if" that can cause problems. These problems arise due to the possibility of degeneracy in
S. If S is degenerate, then there may exist multiple bases for the same vertex. This implies
that all such bases must be available before one can be sure that all adjacent vertices have been
found. Finding all such bases for finding and "scanning" all adjacent vertices can be quite
cumbersome. However, a recent application of Chernikova's work [4,9] has been shown to be
a way around the problem of degeneracy.

Vertex ranking has been used by McKeown [10] to solve fixed charge problems and by
Fluharty [6] to solve quadratic assignment problems. Cabot and Francis [3] also proposed the use of
vertex ranking to solve a certain class of nonconvex quadratic programming problems, e.g.,
quadratic transportation problems. For a survey of vertex ranking procedures, see [11].

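In our problem, we need to determine the linear underapproximation to the first objective
function, (1). We may do this by first noting that

(11)  $u_{ij} = \min(a_i, b_j)$

is an upper bound on $x_{ij}$. We may then note that if $F(x_{ij}) = C_{ij}^k x_{ij} + f_{ij}^k$ for
$\lambda_{ij}^{k-1} \le x_{ij} < \lambda_{ij}^k$, then

(12)  $l_{ij} x_{ij} = \dfrac{F(u_{ij}) - F(0)}{u_{ij}}\, x_{ij} = \Big( C_{ij}^k + \dfrac{f_{ij}^k}{u_{ij}} \Big) x_{ij}$

for $\lambda_{ij}^{k-1} \le u_{ij} < \lambda_{ij}^k$, is a linear underapproximation to $F(x_{ij})$.
We may now form a problem to rank vertices, i.e.,

(13)  Min $Z_L = \sum_{i} \sum_{j} \Big( C_{ij}^k + \dfrac{f_{ij}^k}{u_{ij}} \Big) x_{ij}$

subject to (2) - (7)

for $\lambda_{ij}^{k-1} \le u_{ij} < \lambda_{ij}^k$.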
Using (13) and (2) - (7), we may rank vertices as discussed earlier until some vertex x°
is found such that $L(x°) = \sum_i \sum_j l_{ij} x_{ij}° \ge f(x^*)$, where x* is a candidate for optimality. We
may start with x* equal to the optimal solution to (13) and (2) - (7), and then update it as new,
possibly better solutions to (1) - (7) are found by the ranking procedure. When all vertices x
are found such that $L(x) < f(x^*)$, the solution procedure terminates with the present
candidate being optimal.
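
The ranking loop and its stopping rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' FORTRAN code: the function `rank_vertices`, the four-vertex toy graph, and its L and f values are all hypothetical, and adjacent vertices are supplied by a lookup table rather than by simplex pivoting. Vertices are popped in nondecreasing order of L (valid by Theorem 1 when all adjacent vertices are generated), and the search stops at the first vertex whose L value is at least the true cost of the incumbent.

```python
import heapq

def rank_vertices(start, neighbors, L, f):
    # start: optimal vertex of the linear underapproximation problem.
    # neighbors(v): adjacent vertices of v; L(v) <= f(v) is assumed for all v.
    best, best_val = start, f(start)
    heap = [(L(start), start)]
    seen = {start}
    ranked = []
    while heap:
        l_val, v = heapq.heappop(heap)
        if l_val >= best_val:          # stopping rule: L(x°) >= f(x*)
            break
        ranked.append(v)
        if f(v) < best_val:            # better feasible value: update candidate
            best, best_val = v, f(v)
        for w in neighbors(v):         # next-ranked vertex is adjacent to a ranked one
            if w not in seen:
                seen.add(w)
                heapq.heappush(heap, (L(w), w))
    return best, best_val, ranked

# Hypothetical toy data: L is the underapproximation, f the true concave cost.
L_vals = {"A": 10, "B": 11, "C": 13, "D": 15}
f_vals = {"A": 14, "B": 12, "C": 13, "D": 15}
adj = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}

result = rank_vertices("A", adj.__getitem__, L_vals.__getitem__, f_vals.__getitem__)
print(result)
```

In the toy data the search ranks A and B, adopts B as the candidate, and stops at C because L(C) = 13 is not below f(B) = 12; vertex D is never ranked.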

EXAMPLE: As an example of our procedure, we will solve an incremental quantity
discount version of the example problem presented by Balachandran and Perry [1]. Table 1
below gives the supplies, demands, and costs for each range of shipment. Table 2 gives the
optimal solution to the linear underapproximation problem. The values of $l_{ij}$ are given in the
upper right-hand corner of each cell with shipment being circled in the basic cells.


TABLE 1

Source  Destination  Unit cost [shipment range]
  1         1        3 [20 ≤ x11 < ∞];   4 [10 ≤ x11 < 20];   5 [0 ≤ x11 < 10]
  1         2        6 [10 ≤ x12 < ∞];   7 [5 ≤ x12 < 10];    8 [0 ≤ x12 < 5]
  1         3        3 [27 ≤ x13 < ∞];   4 [15 ≤ x13 < 27];   5 [0 ≤ x13 < 15]
  1         4        4 (one price bracket)
  2         1        6 (one price bracket)
  2         2        5 [65 ≤ x22 < ∞];   6 [20 ≤ x22 < 65];   8 [0 ≤ x22 < 20]
  2         3        8 [10 ≤ x23 < ∞];   9 [5 ≤ x23 < 10];   10 [0 ≤ x23 < 5]
  2         4        15 (one price bracket)
  3         1        1 [27 ≤ x31 < ∞];   2 [20 ≤ x31 < 27];   3 [0 ≤ x31 < 20]
  3         2        3 [60 ≤ x32 < ∞];   4 [30 ≤ x32 < 60];   5 [0 ≤ x32 < 30]
  3         3        10 [20 ≤ x33 < ∞];  11 [10 ≤ x33 < 20];  12 [0 ≤ x33 < 10]
  3         4        5 [30 ≤ x34 < ∞];   6 [20 ≤ x34 < 30];   7 [0 ≤ x34 < 20]

Warehouse Capacity (sources 1, 2, 3):      80, 90, 55
Market Demands (destinations 1, 2, 3, 4):  70, 60, 35, 60

TABLE 2

Source \ Destination      1        2        3        4     Warehouse Capacity
        1               3.43     6.25     4.20     4.00           80
        2               6.00     6.67     8.43    15.00           90
        3               1.85     4.55    10.86     5.91           55
Market Demands            70       60       35       60
As an example of the calculation of the $l_{ij}$ values, we will look at $l_{11}$. First, it is necessary
to calculate $f_{11}^2$ and $f_{11}^3$ using (6). We will do $f_{11}^2$.

$f_{11}^2 = C_{11}^1 (\lambda_{11}^1 - \lambda_{11}^0) - C_{11}^2 \lambda_{11}^1$

$= (5)(10) - 4(10) = 10.$

Similarly, $f_{11}^3 = 30$. $u_{11} = \min\{80, 70\} = 70$. Then, $l_{11} = \dfrac{3(70) + 30}{70} = 3.43$.

Now, if we solve this continuous transportation problem, we get a value of Z = 1042.20
with the circled cells being basic. If we compute the feasible value for this solution, Z = 1067.
Call this solution $X^1$.
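
The calculation of $f_{11}^k$ and $l_{11}$ can be reproduced with a short script. This is a sketch under assumptions: the function names and the list layout (unit costs and bracket lower limits given in order of increasing shipment) are illustrative choices, not notation from the paper. It applies (6) for the fixed charges and (11), (12) for the underapproximating unit cost, using the cell (1,1) data of Table 1.

```python
from bisect import bisect_right

def fixed_charges(costs, lower):
    # costs[k]: unit cost on bracket k; lower[k]: lower limit of bracket k
    # (lower[0] = 0). Returns f for every bracket, following (6) and (7).
    f = [0]
    for k in range(1, len(costs)):
        paid = sum(costs[l] * (lower[l + 1] - lower[l]) for l in range(k))
        f.append(paid - costs[k] * lower[k])
    return f

def underapprox_cost(costs, lower, supply, demand):
    # (11): u = min(a_i, b_j); (12): l = C^k + f^k / u for the bracket holding u.
    u = min(supply, demand)
    k = bisect_right(lower, u) - 1
    return costs[k] + fixed_charges(costs, lower)[k] / u

# Cell (1,1) of Table 1: costs 5, 4, 3 on [0,10), [10,20), [20,inf).
costs_11 = [5, 4, 3]
lower_11 = [0, 10, 20]

f_11 = fixed_charges(costs_11, lower_11)
l_11 = underapprox_cost(costs_11, lower_11, supply=80, demand=70)
print(f_11, round(l_11, 2))   # prints: [0, 10, 30] 3.43
```

The same two functions applied to the other cells of Table 1 give the remaining entries of Table 2, for example 6 + 15/60 = 6.25 for cell (1,2).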

Now, since this solution is nondegenerate, we may use simplex pivoting to look at each
nonbasic cell. The values of these adjacent vertices are given below:

Vertex    Z-Value
(1,1)     1067.10
(1,2)     1118.40
(2,4)     1143.75
(3,2)     1154.40
(3,3)     1141.05
(3,4)     1069.80

Since the Z-value for each vertex is greater than the present value of Z, we do not need to rank
any other vertices, and Z = 1067.0 is the optimal solution value.

4.  COMPUTATIONAL  RESULTS 

To test the vertex ranking procedure discussed here, randomly-generated problems were
run. These problems were generated by first generating supplies and demands uniformly
between upper and lower bounds, U and L. These supplies and demands were generated so
that they were all multiples of 5. This was done to insure the presence of degeneracy in some
of the problems. All problems were set up to have discount ranges at 20, 50, 300, 1000, and
2000. By proper selection of L and U, various numbers of ranges could be tested.

The costs for each arc were generated by randomly generating mileages between each set
of nodes, and then inputting discount cost-per-mile values for each range of flow, e.g., 10, 9,
8, etc. The final discount costs were found by multiplying the mileage between each arc times
the various costs. In this way, various supply-demand discount ranges and cost configurations
could be tested. These problems were generated and solved using a computer code in
FORTRAN run on the CYBER 70/74 using the FTN Compiler with OPT = 1.

The problem characteristics and test results are given in Table 3. The second column
shows the number of vertices of the linear underapproximation other than the optimal solution
that were ranked to solve each problem, while the third column gives the solution time in
seconds. The fourth column gives the size of the problem (m x n); the fifth column gives the
number of cost ranges that the arc flows would cover; the sixth column gives the cost per mile
for each range of flow, $p_{ij}^k$; the seventh column gives the lower and upper ranges used to
generate the supplies and demands; and finally, the last column gives the ranges used to generate
mileages. The $C_{ij}^k$ values were determined by $C_{ij}^k = p_{ij}^k$ (mileage). As can be seen, the
algorithm successfully solved all problems tested. The most difficult problems were those with
three ranges and supplies/demands between 5 and 100. Problems 6 and 13 are identical, except
that 6 is over only 3 ranges, while 13 is over 5; but, problem 13 is solved in much less time. In
fact, the linear underapproximating transportation problem was found to be optimal and no
other extreme points were even ranked. This was also the case in problems 7, 9, 10, 11, and
12, even though the number of variables increased markedly. It is also interesting to note the
effect of costs in problems 4, 5, and 6. These are essentially the same problem, but with the
percent decrease in cost for increasing flow being less in each case. The results are as expected
since in problem 6 the linear underapproximation will be closer to the actual objective function
than in problems 4 and 5.


TABLE 3 - Computational Results

Problem   Vertices   Solution   m x n   Number of   Cost per mile     L, U       Mileage
Number    Ranked     Time               Ranges      p_ij^k                       Range
   1         13        2.604    6 x 8       3       10,9,8            1,50       100,200
   2         39       10.289    8 x 8       3       10,9,8            1,50       100,200
   3         39       34.103    9 x 9       3       10,9,8            1,50       100,200
   4        257       42.938    6 x 8       3       10,9,8            1,100      100,200
   5        247       39.799    6 x 8       3       20,18,17          1,100      100,200
   6         84       13.353    6 x 8       3       20,19,18          1,100      100,200
   7          0         .121    4 x 6       5       20,19,18,17,16    400,500    50,100
   8         18         .888    4 x 8       5       20,19,18,17,16    400,500    50,100
   9          0         .196    6 x 8       5       20,19,18,17,16    400,500    50,100
  10          0         .393    8 x 8       5       20,19,18,17,16    400,500    50,100
  11          0         .518    9 x 9       5       20,19,18,17,16    400,500    50,100
  12          0         .130    4 x 6       5       10,9,8,7,6        400,500    50,100
  13          0         .213    6 x 8       5       20,19,18,17,16    400,500    100,200

It would appear from these results that vertex ranking does hold promise as a solution
procedure for incremental cost discount transportation problems. Neither size of problem nor
degeneracy appears to have any effect on solution time, but cost patterns and number of cost
ranges do seem to have a marked effect.

Extensions of this work could be used to solve other concave linear programming
problems. Walker [16] discusses the fact that these can be considered as generalizations of fixed
charge problems. The main difference would be that the first linear portion would have a
positive fixed charge rather than zero, as in the problem discussed here. However, this would not
change the approach to the solution used here.
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AUXILIARY  PROCEDURES  FOR  SOLVING  LONG 
TRANSPORTATION  PROBLEMS 


J. Intrator and M. Berrebi

Bar-Ilan University
Ramat-Gan, Israel

ABSTRACT 

An  efficient  auxiliary  algorithm  for  solving  transportation  problems,  based 
on  a  necessary  but  not  sufficient  condition  for  optimum,  is  presented. 


In  this  paper  a  necessary  (but  not  sufficient)  condition  for  a  given  feasible  solution  to  a 
transportation  problem  to  be  optimal  is  established,  and  a  special  algorithm  for  finding  solutions 
which  satisfy  this  condition  is  adapted  as  an  auxiliary  procedure  for  the  MODI  method. 

Experimental results presented show that finding an initial solution which satisfies this
necessary condition for problems with m << n eliminates 70%-90% of the MODI iterations.
(See Table 1.)


TABLE 1 - Matrix of Principal Results

m \ n     20     30     40     50    100    200    300
  4      0.65   0.69   0.72   0.74   0.88   0.91   0.93
  5      0.61   0.67   0.69   0.71   0.84   0.87   0.90
  6      0.59   0.65   0.66   0.68   0.80   0.82   0.85
  8      0.61   0.62   0.64   0.66   0.76   0.80   0.82
 10      0.57   0.65   0.66   0.69   0.73   0.77   0.80
 20      0.25   0.27   0.31   0.36   0.45   0.50   0.52

Fraction of MODI iterations eliminated by using the method presented
in this paper.


The  case  when  our  algorithm  is  used  during  the  solution  process  (especially  for  m  —  n)  is 
presently  being  examined.  Our  auxiliary  procedure  requires  relatively  little  computational  effort 
in  finding  the  appropriate  candidate  for  the  basis,  eliminating  entirely  the  need  to  calculate  the 
dual  variables.  It  works  with  positive  variables  associated  with  one  pair  of  rows  at  a  time  using 
only the prices of these rows.

Once a loop for any given pair of rows is determined it may be used to insert numerous
non-basic cells in these two rows into the basis. The result is a considerable time reduction in
determining loops.

The storage and time requirements for the special lists needed in our auxiliary algorithm
are fully discussed in [1]. A rigorous proof presented in [1] shows that updating these lists
requires no more than $O(m \log n)$ computer operations per MODI iteration.

A Linear Programming Transportation Problem is characterized by a cost matrix $C$ and
two positive requirement vectors $a$ and $b$ such that $\sum_i a_i = \sum_j b_j$. The problem is to minimize
$\sum_i \sum_j c_{ij} x_{ij}$ subject to

(A)  $\sum_i x_{ij} = b_j, \quad j = 1, 2, \ldots, n$

     $\sum_j x_{ij} = a_i, \quad i = 1, 2, \ldots, m$

     $x_{ij} \ge 0$ for all $(i,j)$.
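As a concrete reading of problem (A), the following minimal Python sketch evaluates the objective and checks the constraints for a given shipment matrix. The function names `transport_cost` and `is_feasible`, the tolerance `tol`, and the small 2 x 3 data set are illustrative assumptions, not taken from the paper.

```python
def transport_cost(C, X):
    """Objective of (A): sum over all cells of c_ij * x_ij."""
    return sum(c * x for c_row, x_row in zip(C, X) for c, x in zip(c_row, x_row))


def is_feasible(X, a, b, tol=1e-9):
    """Check the row sums, column sums and nonnegativity required by (A)."""
    rows_ok = all(abs(sum(row) - a_i) <= tol for row, a_i in zip(X, a))
    cols_ok = all(abs(sum(row[j] for row in X) - b_j) <= tol
                  for j, b_j in enumerate(b))
    nonneg = all(x >= -tol for row in X for x in row)
    return rows_ok and cols_ok and nonneg


# Hypothetical data: m = 2 origins, n = 3 destinations, total supply = total demand = 50.
C = [[4, 6, 8],
     [5, 3, 7]]
a = [30, 20]
b = [10, 25, 15]
X = [[10, 5, 15],
     [0, 20, 0]]

print(is_feasible(X, a, b))   # True
print(transport_cost(C, X))   # 250
```

The example solution has m + n - 1 = 4 positive variables, so it is a nondegenerate basic solution in the sense of condition (1) below.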

A  proper  perturbation  of  our  problem  ensures  that: 

(1) each feasible basic solution of (A) contains exactly $m + n - 1$ positive variables $x_{ij}$,

(2) corresponding to each nonbasic cell $(i,j)$ $(x_{ij} = 0)$ there exists a unique loop of different
cells, say

(B)  $L(i,j) = (i,j_1)\,(i_2,j_1)\,(i_2,j_2)\,(i_3,j_2)\,\cdots\,(i_r,j_{r-1})\,(i_r,j)\,(i,j)$

which contains at most two cells in each row and column, where the cell $(i,j)$ is the unique
nonbasic cell,

(3)  there  are  no  loops  which  contain  basic  cells  only. 

Notation: For fixed $l,k$ $(1 \le l \ne k \le m)$ we denote

$V_l = \{j \mid x_{lj} > 0\}, \quad 1 \le j \le n$

$V_{lk} = V_l \cap V_k = \{j \mid x_{lj} > 0,\ x_{kj} > 0\}$.

With no loss of generality it is assumed that for each $l$ $(1 \le l \le m)$ there exists at least one
destination (column) $j \in V_l$ such that $(l,j)$ is the unique basic cell of column $j$. Otherwise, an
artificial destination, say $J$, with $x_{lJ} = b_J = \epsilon$, where $\epsilon$ is an infinitely small positive number, will
be introduced.

It is easy to see that the feasible solution of the augmented problem of dimension
$m \times (n + 1)$ satisfies (1), (2), (3) mentioned above.

DEFINITION 1: A destination with a unique basic cell will be called a fundamental destination.


The unique nonbasic cell $(i,j)$ of $L(i,j)$ will be considered for convenience to be the last
cell of $L(i,j)$. For each loop $L(i,j)$, say loop (B), we introduce the notation:

$C_{L(i,j)} = c_{i j_1} - c_{i_2 j_1} + c_{i_2 j_2} - c_{i_3 j_2} + \cdots + c_{i_r j} - c_{ij}$.

It is well known that

$C_{L(i,j)} = u_i + v_j - c_{ij}$, where $u_i$ and $v_j$ are the dual prices.
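The identity relating the loop value to the dual prices can be checked directly. The following short derivation is a sketch, assuming (as in the MODI method) that the dual prices satisfy $c_{pq} = u_p + v_q$ on every basic cell $(p,q)$ and writing $i_1 = i$; the grouping of terms is ours.

```latex
\begin{align*}
C_{L(i,j)} &= \sum_{s=1}^{r-1} \bigl(c_{i_s j_s} - c_{i_{s+1} j_s}\bigr) + c_{i_r j} - c_{ij} \\
           &= \sum_{s=1}^{r-1} \bigl(u_{i_s} - u_{i_{s+1}}\bigr) + \bigl(u_{i_r} + v_j\bigr) - c_{ij} \\
           &= \bigl(u_{i} - u_{i_r}\bigr) + u_{i_r} + v_j - c_{ij} \\
           &= u_i + v_j - c_{ij}.
\end{align*}
```

Every cell of the loop except the last one, $(i,j)$, is basic, which is why the substitution applies to all terms but $c_{ij}$.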

DEFINITION 2: A loop $L$ with $C_L > 0$ is called an improving loop.

DEFINITION 3: Let $l,k$ be a fixed pair of numbers so that $1 \le l \ne k \le m$. We define

$A_{lk} = \{j \mid x_{lj} > 0,\ x_{kj} = 0\} = V_l - V_{lk}$

$D_{lk}(j) = c_{lj} - c_{kj}, \quad j = 1, 2, \ldots, n$.
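The sets and differences just defined depend only on two rows of the current solution and of the cost matrix. A minimal Python sketch, assuming 0-indexed rows and columns and hypothetical helper names (`positive_columns`, `pair_sets`, `D`) and data:

```python
def positive_columns(X, row):
    """V_row: the columns j with x[row][j] > 0."""
    return {j for j, x in enumerate(X[row]) if x > 0}


def pair_sets(X, l, k):
    """Return (V_lk, A_lk, A_kl) for the pair of rows l, k."""
    V_l, V_k = positive_columns(X, l), positive_columns(X, k)
    return V_l & V_k, V_l - V_k, V_k - V_l


def D(C, l, k, j):
    """D_lk(j) = c_lj - c_kj."""
    return C[l][j] - C[k][j]


# Hypothetical 2 x 3 instance (rows l = 0, k = 1).
C = [[4, 6, 8],
     [5, 3, 7]]
X = [[10, 20, 0],
     [0, 5, 15]]

V_lk, A_lk, A_kl = pair_sets(X, 0, 1)
print(V_lk, A_lk, A_kl)                     # {1} {0} {2}
print([D(C, 0, 1, j) for j in range(3)])    # [-1, 3, 1]
```

Note that $D_{kl}(j) = -D_{lk}(j)$, so one array of differences serves both orderings of the pair.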

THEOREM 1: The number of elements in $V_{lk}$ is at most 1.

PROOF: Suppose that $j_1, j_2 \in V_{lk}$ $(1 \le j_1 \ne j_2 \le n)$; then the loop $(l,j_1)\,(k,j_1)\,(k,j_2)\,(l,j_2)$
consists of basic cells only, contradicting (3) above.

Let $J_1$ be a fundamental destination of $A_{lk}$. The purpose of Theorem 2 and Theorem 3 is
to show that all the simplex loops $L(i,J)$ and the numbers $C_{L(i,J)}$, $i = l,k$; $J \in A_{lk} \cup A_{kl}$, are determined
after the simplex loop $L(k,J_1)$ is found.

THEOREM 2: $C_{L(k,J_2)} - D_{lk}(J_2) = C_{L(k,J_1)} - D_{lk}(J_1)$ for all $J_2 \in A_{lk}$,
$J_1$ being the above fundamental destination of $A_{lk}$.

PROOF:

CASE (a) $V_{lk} \ne \emptyset$. Denote by $j$ the unique member of $V_{lk}$ (Theorem 1). We have $j \ne J_1$,
$j \ne J_2$ ($J_1$ and $J_2 \notin V_k$) and

$L(k,J_1) = (k,j)\,(l,j)\,(l,J_1)\,(k,J_1)$

$L(k,J_2) = (k,j)\,(l,j)\,(l,J_2)\,(k,J_2)$

i.e.,

$C_{L(k,J_2)} - D_{lk}(J_2) = C_{L(k,J_1)} - D_{lk}(J_1)$.

CASE (b) $V_{lk} = \emptyset$. Let $L(k,J_1)$ be the loop (B). Note that $i_r = l$ (since column $J_1$ contains
a basic cell in row $l$ exclusively) and $r > 2$. Otherwise, $i_r = i_2 = l$ and $L(k,J_1) = (k,j)\,(l,j)\,(l,J_1)\,(k,J_1)$
means that $j \in V_{lk}$, contradicting the fact that $V_{lk} = \emptyset$.

Consider the loop:

$L = (k,j_1)\,(i_2,j_1)\,(i_2,j_2)\,(i_3,j_2)\,\cdots\,(i_r,j_{r-1})\,(i_r,J_2)\,(k,J_2), \quad (i_r = l)$

obtained from $L(k,J_1)$ by substituting $J_2$ for $J_1$. Let us show that either $L(k,J_2) = L$ or
$L(k,J_2)$ can be obtained from $L$ by deleting two identical cells.

At first, observe that all rows and columns of $L$ (except perhaps $J_2$) contain exactly two
different cells of $L$. The column $J_2$ has not appeared previously (unless $J_2 = j_{r-1}$) because if it
equals one of the previous members $j_s$, $1 \le s \le r - 2$, then the loop $(i_{s+1},j_s)\,(i_{s+1},j_{s+1})\,\cdots\,(i_r,J_2)$
will be a loop of basic cells only, which contradicts (3).

Thus, only two possibilities exist:

(1) $J_2 \ne j_{r-1}$ and $L(k,J_2) = L$, or

(2) $J_2 = j_{r-1}$ and $L(k,J_2)$ is obtained from $L$ by deleting the two identical cells $(i_r,j_{r-1})$ and
$(i_r,J_2)$.

Since this deletion does not affect the value of $C_L$, we have for both possibilities
$C_{L(k,J_2)} - D_{lk}(J_2) = C_{L(k,J_1)} - D_{lk}(J_1)$.

THEOREM 3: Let $J_1$ and $J_2$ be the destinations defined in Theorem 2 and $J_3 \in A_{kl}$. We
shall prove that

$C_{L(l,J_3)} = -[C_{L(k,J_2)} - D_{lk}(J_2)] + D_{kl}(J_3)$.

PROOF: Let $L(k,J_1)$ be the loop (B) with $i_r = l$ because $J_1$ is fundamental.

Consider the loop $L$ defined by

$L = (i_r,j_{r-1})\,(i_{r-1},j_{r-1})\,\cdots\,(i_2,j_1)\,(i_1,j_1)\,(k,J_3)\,(l,J_3)$.

CASE (a) $V_{lk} \ne \emptyset$. Same proof as in Theorem 2.

CASE (b) $V_{lk} = \emptyset$. By the same argument as in Theorem 2 we can show that there are
only two possibilities.

1) $J_3 \ne j_1$, which implies that $L(l,J_3) = L$.

2) $J_3 = j_1$. In this case $(r > 2)$ and $L(l,J_3)$ can be obtained from $L$ by deleting the two identical
cells $(i_1,j_1)$ and $(k,J_3)$.

In the two cases we have

$C_{L(l,J_3)} = -[C_{L(k,J_1)} - D_{lk}(J_1)] + D_{kl}(J_3)$

and by Theorem 2 we have

$C_{L(l,J_3)} = -[C_{L(k,J_2)} - D_{lk}(J_2)] + D_{kl}(J_3)$.

THEOREM 4: If $D_{lk}(J_2) > D_{lk}(J_3)$ then either $L(k,J_2)$ or $L(l,J_3)$ is an improving simplex
loop.

PROOF: Since $D_{lk}(J_3) = -D_{kl}(J_3)$, it follows from Theorem 3 that

$C_{L(k,J_2)} + C_{L(l,J_3)} = D_{lk}(J_2) - D_{lk}(J_3) > 0$

(as $D_{lk}(J_2) > D_{lk}(J_3)$), and either $C_{L(l,J_3)}$ or $C_{L(k,J_2)}$ is a positive number, i.e., either $L(l,J_3)$ or
$L(k,J_2)$ is an improving simplex loop (Definition 2).

COROLLARY: At optimum we have $D_{lk}(J_2) \le D_{lk}(J_3)$.

DEFINITION 4: Define $J_{lk}$ by

$D_{lk}(J_{lk}) = \max_{j \in V_l} D_{lk}(j)$.

REMARK 1: We shall suppose that $D_{lk}(j_1) = D_{lk}(j_2)$ if and only if $j_1 = j_2$. Otherwise,
a cost perturbed problem with $\bar{c}_{ij} = c_{ij} + \epsilon^{mi+j}$ can be considered and

$\bar{D}_{lk}(j_1) - \bar{D}_{lk}(j_2) = c_{lj_1} - c_{kj_1} + \epsilon^{ml+j_1} - \epsilon^{mk+j_1} - c_{lj_2} + c_{kj_2} - \epsilon^{ml+j_2} + \epsilon^{mk+j_2}$

which for sufficiently small $\epsilon > 0$ is equal to 0 only for $j_1 = j_2$.

THEOREM 5: If at optimality $V_{lk} = \emptyset$ then $D_{lk}(J_{lk}) < D_{lk}(J_{kl})$ $(J_{lk} \ne J_{kl})$, else
$D_{lk}(J_{lk}) = D_{lk}(J_{kl})$ $(J_{lk} = J_{kl})$.

PROOF: If $V_{lk} = \emptyset$ then $D_{lk}(J_{lk}) \ne D_{lk}(J_{kl})$; otherwise (by cost-perturbation) $J_{lk} = J_{kl}$
and $V_{lk} \ne \emptyset$.

The first part of Theorem 5 follows now immediately from the corollary of Theorem 4.

If $V_{lk} \ne \emptyset$ and $j$ is the unique element of $V_{lk}$ then by the definition of $J_{lk}$ and from $j \in V_l$ we
have $D_{lk}(j) \le D_{lk}(J_{lk})$.

Let us show that $j = J_{lk}$. Suppose that $j \ne J_{lk}$; then we have $D_{lk}(j) < D_{lk}(J_{lk})$ (Definition 4)
and the simplex loop

$L(k,J_{lk}) = (k,j)\,(l,j)\,(l,J_{lk})\,(k,J_{lk})$ will be an improving simplex loop since

$C_{L(k,J_{lk})} = D_{kl}(j) + D_{lk}(J_{lk}) = D_{lk}(J_{lk}) - D_{lk}(j) > 0$

contradicting the fact that we have optimality.

Thus, $j = J_{lk}$. By the same argument we have $j = J_{kl}$.

A simple algorithm consists of

1) Computing the differences $D_{lk}(J)$,

2) Comparing $D_{lk}(J_{lk})$ to $D_{lk}(J_{kl})$.

If $D_{lk}(J_{lk}) > D_{lk}(J_{kl})$ (or if $J_{lk} \ne J_{kl}$ for non-empty $V_{lk}$) we improve our solution, using
all the nonbasic cells $(l,J)$ or $(k,J)$ where $J \in (V_l \cup V_k)$ such that $D_{lk}(J_{kl}) \le D_{lk}(J) \le D_{lk}(J_{lk})$,
by searching only the first loop involving the rows $l$ and $k$.

The other loops will be obtained by changing the last two cells, keeping the first $2r - 2$ cells
in the same order or in the opposite order (Theorem 2 and Theorem 3).
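The comparison in step 2) can be sketched as a pairwise test on the rows of a feasible solution. The following is a minimal Python sketch of the necessary condition of Theorem 5 only; it does not search loops or change the basis. It assumes 0-indexed rows, distinct differences (the cost perturbation of Remark 1), and uses the hypothetical name `violating_pairs`. Since $D_{kl}(j) = -D_{lk}(j)$, the quantity $D_{lk}(J_{kl})$ equals the minimum of $D_{lk}(j)$ over $j \in V_k$, so only pairs with $l < k$ need testing.

```python
def violating_pairs(C, X):
    """Row pairs (l, k), l < k, with max_{j in V_l} D_lk(j) > min_{j in V_k} D_lk(j)."""
    m = len(X)
    # V[i] lists the columns with a positive shipment in row i.
    V = [[j for j, x in enumerate(row) if x > 0] for row in X]
    bad = []
    for l in range(m):
        for k in range(l + 1, m):
            if not V[l] or not V[k]:
                continue  # cannot occur for feasible solutions with positive supplies
            d_over_l = max(C[l][j] - C[k][j] for j in V[l])   # D_lk(J_lk)
            d_over_k = min(C[l][j] - C[k][j] for j in V[k])   # D_lk(J_kl)
            if d_over_l > d_over_k:
                bad.append((l, k))
    return bad


# Hypothetical data: the first solution is optimal, the second is not.
C = [[4, 6, 8],
     [5, 3, 7]]
X_opt = [[10, 5, 15], [0, 20, 0]]
X_bad = [[10, 20, 0], [0, 5, 15]]

print(violating_pairs(C, X_opt))   # []
print(violating_pairs(C, X_bad))   # [(0, 1)]
```

An empty result means only that the necessary condition holds; as the paper stresses, the condition is not sufficient, so the MODI method is still needed to confirm optimality.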

REMARK: In order to assure that the first loop will not be a shortened loop, this loop
will be obtained by using a fundamental artificial destination $J$ with only one basic cell in the $k$th
row with $x_{kJ} = \epsilon$.

The proposed technique was applied to each pair of rows $(l,k)$ until $D_{lk}(J_{lk}) \le D_{lk}(J_{kl})$ for
all $1 \le l \ne k \le m$. At that point the MODI method was implemented. Performing a MODI
iteration frequently caused $D_{lk}(J_{lk}) > D_{lk}(J_{kl})$ for some $1 \le l \ne k \le m$, which would enable
further utilization of the proposed technique. However, for the purpose of the present experiment
the proposed technique was not reactivated after the initial processing. (See Table 1.)

The storage and time requirements of the lists $J_{lk}$ when updated at each MODI iteration
are fully discussed in [1].

One possible way to update this list may be described as follows: For each $l$ the destinations
$j \in V_l$ are ordered in $m - 1$ sequences $P_{lk}$ $(1 \le k \ne l \le m)$ of increasing $D_{lk}(j)$.
Thus $P_{lk} = (j_1, j_2, \ldots, j_{N_l})$ ($N_l$ = the number of elements in $V_l$), with $D_{lk}(j_1) < D_{lk}(j_2) < \cdots < D_{lk}(j_{N_l})$
(equality excluded because of the supposed cost-perturbation). These $P_{lk}$ sequences
are organized in heaps. Adding or deleting an item from a heap requires $O(\log N_l) \le O(\log n)$
computer operations. Since at each simplex iteration only one basic cell, say $(\sigma,t)$, becomes
nonbasic and one nonbasic cell, say $(s,r)$, becomes basic, we have to update $2(m - 1)$ heaps
($P_{\sigma p}$ and $P_{sr}$ for all $p \ne \sigma$, $r \ne s$), which amounts to $O(m \log n)$ computer operations per
simplex iteration (for heaps, see [2]).
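A minimal Python sketch of one such heap follows. It is not the data structure of [1] or [2]: Python's `heapq` module offers no arbitrary-deletion primitive, so this sketch substitutes lazy deletion, which gives logarithmic cost per operation only in the amortized sense. The class name `LazyMaxHeap` and its methods are hypothetical. One instance would hold the columns of $V_l$ keyed by $D_{lk}(j)$, so that `top()` returns $J_{lk}$.

```python
import heapq


class LazyMaxHeap:
    """Max-heap over (item, key) pairs with lazy deletion."""

    def __init__(self):
        self._heap = []     # entries (-key, item); heapq is a min-heap
        self._alive = {}    # item -> key that is currently valid

    def add(self, item, key):
        """Insert an item (e.g. a column entering V_l) with its key D_lk(item)."""
        self._alive[item] = key
        heapq.heappush(self._heap, (-key, item))

    def remove(self, item):
        """Mark an item (e.g. a column leaving V_l) as deleted."""
        self._alive.pop(item, None)

    def top(self):
        """Return (item, key) with the largest key, or None if empty."""
        while self._heap:
            neg_key, item = self._heap[0]
            if self._alive.get(item) == -neg_key:
                return item, -neg_key
            heapq.heappop(self._heap)   # discard a stale entry
        return None


# Hypothetical use: columns 0, 1, 2 of V_l with differences D_lk = -1, 3, 1.
P = LazyMaxHeap()
for j, d in [(0, -1), (1, 3), (2, 1)]:
    P.add(j, d)
print(P.top())     # (1, 3)
P.remove(1)        # column 1 leaves the basis in row l
print(P.top())     # (2, 1)
```

When a basic cell leaves row $\sigma$ and another enters row $s$, the $m - 1$ heaps of each of those two rows would receive one `remove` or one `add`, matching the count of $2(m - 1)$ heap updates given above.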
ABSTRACT

In this paper we address the question of deriving deep cuts for nonconvex
disjunctive programs. These problems include logical constraints which restrict
the variables to at least one of a finite number of constraint sets. Based on the
works of Balas, Glover, and Jeroslow, we examine the set of valid inequalities
or cuts which one may derive in this context, and, defining reasonable criteria
to measure the depth of a cut, we demonstrate how one may obtain the "deepest"
cut. The analysis covers the case where each constraint set in the logical statement
has only one constraint and is also extended to the case where each of
these constraint sets may have more than one constraint.


1.  INTRODUCTION 

A Disjunctive Program is an optimization problem where the constraints represent logical
conditions. In this study we are concerned with such conditions expressed as linear constraints.
Several well-known problems can be posed as disjunctive programs, including the zero-one
integer programs. The logical conditions may include conjunctive statements, disjunctive statements,
negation and implication, as discussed in detail by Balas [1,2]. However, an implication
can be restated as a disjunction, and conjunctions and negations lead to a polyhedral constraint
set. Thus, this study deals with the harder problem involving disjunctive restrictions, which are
essentially nonconvex problems.

It is interesting to note that disjunctive programming provides a powerful unifying theory
for cutting plane methodologies. The approach taken by Balas [2] and Jeroslow [14] is to
characterize all valid cutting planes for disjunctive programs. As such, it naturally leads to a
statement which subsumes prior efforts at presenting a unified theory using convex sets, polar
sets and level sets of gauge functions [1,2,5,6,8,13,14]. On the other hand, the approach taken
by Glover [10] is to characterize all valid cutting planes through relaxations of the original
disjunctive program. Constraints are added sequentially, and when all the constraints are
considered, Glover's result is equivalent to that of Balas and Jeroslow. Glover's approach is a
constructive procedure for generating valid cuts, and may prove useful algorithmically.

The principal thrust of the methodologies of disjunctive programming is the generation of
cutting planes based on the linear logical disjunctive conditions in order to solve the
corresponding nonconvex problem. Such methods have been discussed severally by Balas
[1,2,3], Glover [8], Glover, Klingman and Stutz [11], Jeroslow [14] and briefly by Owen [17].
But the most fundamental and important result of disjunctive programming has been stated by
Balas [1,2] and Jeroslow [14], and in a different context by Glover [10]. It unifies and subsumes
several earlier statements made by other authors and is restated below. This result not
only provides a basis for unifying cutting plane theory, but also provides a different perspective
for examining this theory. In order to state this result, we will need to use the following
notation and terminology.


Consider the linear inequality systems $S_h$, $h \in H$ given by

(1.1)  $S_h = \{x : A^h x \ge b^h,\ x \ge 0\}, \quad h \in H$

where $H$ is an appropriate index set. We may state a disjunction in terms of the sets $S_h$, $h \in H$
as a condition which asserts that a feasible point must satisfy at least one of the constraint sets $S_h$,
$h \in H$. Notationally, we imply by such a disjunction the restriction $x \in \bigcup_{h \in H} S_h$. Based on this
disjunction, an inequality $\pi^t x \ge \pi_0$ will be considered a valid inequality or a valid disjunctive cut
if it is satisfied for each $x \in \bigcup_{h \in H} S_h$. (The superscript $t$ will throughout be taken to denote the
transpose operation.) Finally, for a set of vectors $\{v^h : h \in H\}$, where $v^h = (v_1^h, \ldots, v_n^h)$ for
each $h \in H$, we will denote by $\sup_{h \in H} (v^h)$ the pointwise supremum $v = (v_1, \ldots, v_n)$ of the
vectors $v^h$, $h \in H$, such that $v_j = \sup_{h \in H} \{v_j^h\}$ for $j = 1, \ldots, n$.


Before proceeding, we note that a condition which asserts that a feasible point must satisfy
at least $p$ of some $q$ sets, $p \le q$, may be easily transformed into the above disjunctive statement
by letting each $S_h$ denote the conjunction of the $q$ original sets taken $p$ at a time. Thus,
$H = \{1, \ldots, \binom{q}{p}\}$ in this case. Now consider the following result.

THEOREM 1: (Basic Disjunctive Cut Principle), Balas [1,2], Glover [10], Jeroslow
[14]

Suppose that we are given the linear inequality systems $S_h$, $h \in H$ of Equation (1.1), where
$|H|$ may or may not be finite. Further, suppose that a feasible point must satisfy at least one
of these systems. Then, for any choice of nonnegative vectors $\lambda^h$, $h \in H$, the inequality

(1.2)  $\left\{ \sup_{h \in H} (\lambda^h)^t A^h \right\} x \ge \inf_{h \in H} (\lambda^h)^t b^h$

is a valid disjunctive cut. Furthermore, if every system $S_h$, $h \in H$ is consistent, and if $|H| < \infty$,
then for any valid inequality $\sum_{j=1}^n \pi_j x_j \ge \pi_0$, there exist nonnegative vectors $\lambda^h$, $h \in H$ such
that $\pi_0 \le \inf_{h \in H} (\lambda^h)^t b^h$ and for $j = 1, \ldots, n$, the $j$th component of $\sup_{h \in H} (\lambda^h)^t A^h$ does not
exceed $\pi_j$.
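For a finite index set, the forward part of Theorem 1 is a direct computation. The following minimal Python sketch forms the cut (1.2) from given systems and multipliers; the function name `disjunctive_cut`, the list-based data layout, and the two-variable example are illustrative assumptions.

```python
def disjunctive_cut(systems, multipliers):
    """Return (pi, pi0) of cut (1.2) for finitely many systems.

    systems:     list of (A, b), where A is a list of rows and b a list of
                 right-hand sides, describing S_h = {x : A x >= b, x >= 0}.
    multipliers: list of nonnegative vectors lambda^h, one entry per row of A.
    """
    n = len(systems[0][0][0])
    combined, rhs = [], []
    for (A, b), lam in zip(systems, multipliers):
        if any(l < 0 for l in lam):
            raise ValueError("multipliers must be nonnegative")
        # (lambda^h)^t A^h and (lambda^h)^t b^h for this system.
        combined.append([sum(l * row[j] for l, row in zip(lam, A)) for j in range(n)])
        rhs.append(sum(l * b_i for l, b_i in zip(lam, b)))
    pi = [max(vec[j] for vec in combined) for j in range(n)]   # pointwise supremum
    pi0 = min(rhs)                                             # infimum
    return pi, pi0


# Hypothetical disjunction: x1 >= 1 or x2 >= 1, with unit multipliers.
S1 = ([[1, 0]], [1])
S2 = ([[0, 1]], [1])
print(disjunctive_cut([S1, S2], [[1], [1]]))   # ([1, 1], 1)
```

In this example the resulting cut is $x_1 + x_2 \ge 1$, which every nonnegative point satisfying either system clearly obeys.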

The forward part of the above theorem was originally proved by Balas [2] and the converse
part by Jeroslow [14]. This theorem has also been independently proved by Glover [10]
in a somewhat different setting. The theorem merely states that given a disjunction $x \in \bigcup_{h \in H} S_h$,
one may generate a valid cut (1.2) by specifying any nonnegative values for the vectors $\lambda^h$,
$h \in H$. The versatility of the latter choice is apparent from the converse, which asserts that so
long as we can identify and delete any inconsistent systems $S_h$, $h \in H$, then given any valid cut
$\pi^t x \ge \pi_0$, we may generate a cut of the type (1.2) by suitably selecting values for the parameters
$\lambda^h$, $h \in H$ such that for any $x$ belonging to the nonnegative orthant of $R^n$, if (1.2) holds
then we must have $\pi^t x \ge \pi_0$. In other words, we can make a cut of the type (1.2) uniformly
dominate any given valid inequality or cut. Thus, any valid inequality is either a special case of
(1.2) or may be strictly dominated by a cut of type (1.2). In this connection, we draw the
reader's attention to the work of Balas [1] in which several convexity/intersection cuts discussed
in the literature are recovered from the fundamental disjunctive cut. Note that since the
inequality (1.2) defines a closed convex set, then for it to be valid, it must necessarily contain
the polyhedral set

(1.3)  $S = \text{convex hull of } \bigcup_{h \in H} S_h$.

Hence, one may deduce that a desirable deep cut would be a facet of $S$, or at least would support
it. Indeed, Balas [3] has shown how one may generate with some difficulty cuts which
contain as a subset the facets of $S$ when $|H| < \infty$. Our approach to developing deep disjunctive
cuts will bear directly on Theorem 1. Specifically, we will be indicating how one may
specify values for parameters $\lambda^h$ to provide supports of $S$, and will discuss some specific criteria
for choosing among supports. We will be devoting our attention to the following two disjunctions
titled DC1 and DC2. We remark that most disjunctive statements can be cast in the format
of DC2. Disjunction DC1 is a special case of disjunction DC2, and is discussed first
because it facilitates our presentation.


DC1:

Suppose that each system $S_h$ is comprised of a single linear inequality, that is, let

(1.4)  $S_h = \left\{ x : \sum_{j=1}^n a_{1j}^h x_j \ge b_1^h,\ x \ge 0 \right\}$ for $h \in H = \{1, \ldots, \hat{h}\}$

where we assume that $\hat{h} = |H| < \infty$ and that each inequality in $S_h$, $h \in H$ is stated with the origin
as the current point at which the disjunctive cut is being generated. Then, the disjunctive
statement DC1 is that at least one of the sets $S_h$, $h \in H$ must be satisfied. Since the current
point (origin) does not satisfy this disjunction, we must have $b_1^h > 0$ for each $h \in H$. Further,
we will assume, without loss of generality, that for each $h \in H$, $a_{1j}^h > 0$ for some
$j \in \{1, \ldots, n\}$, or else $S_h$ is inconsistent and we may disregard it.


DC2:

Suppose each system $S_h$ is comprised of a set of linear inequalities, that is, let

(1.5)  $S_h = \left\{ x : \sum_{j=1}^n a_{ij}^h x_j \ge b_i^h \text{ for each } i \in Q_h,\ x \ge 0 \right\}$ for $h \in H = \{1, \ldots, \hat{h}\}$

where $Q_h$, $h \in H$ are appropriate constraint index sets. Again, we assume that $\hat{h} = |H| < \infty$
and that the representation in (1.5) is with respect to the current point as the origin. Then, the
disjunctive statement DC2 is that at least one of the sets $S_h$, $h \in H$ must be satisfied. Although
it is not necessary here for $b_i^h > 0$ for all $i \in Q_h$, one may still state a valid disjunction by deleting
all constraints with $b_i^h \le 0$, $i \in Q_h$ from each set $S_h$, $h \in H$. Clearly a valid cut for the
relaxed constraint set is valid for the original constraint set. We will thus obtain a cut which
possibly is not as strong as may be derived from the original constraints. To aid in our development,
we will therefore assume henceforth that $b_i^h > 0$, $i \in Q_h$, $h \in H$.


Before proceeding with our analysis, let us briefly comment on the need for deep cuts.
Although intuitively desirable, it is not always necessary to seek a deepest cut. For example, if
one is using cutting planes to implicitly search a feasible region of discrete points, then all cuts
which delete the same subset of this discrete region may be equally attractive irrespective of
their depth relative to the convex hull of this discrete region. Such a situation arises, for example,
in the work of Majthay and Whinston [16]. On the other hand, if one is confronted with
the problem of iteratively exhausting a feasible region which is not finite, as in [20] for example,
then indeed deep cuts are meaningful and desirable.

2.  DEFINING  SUITABLE  CRITERIA  FOR  EVALUATING  THE  DEPTH  OF  A  CUT 

In  this  section,  we  will  lay  the  foundation  for  the  concepts  we  propose  to  use  in  deriving 
deep  cuts.  Specifically,  we  will  explore  the  following  two  criteria  for  deriving  a  deep  cut: 

(i)  Maximize  the  euclidean  distance  between  the  origin  and  the  nonnegative  region 
feasible  to  the  cutting  plane 

(ii)  Maximize  the  rectilinear  distance  between  the  origin  and  the  nonnegative  region 
feasible  to  the  cutting  plane. 

Let us briefly discuss the choice of these criteria. Referring to Figure 1(a) and (b), one
may observe that simply attempting to maximize the euclidean distance from the origin to the
cut can favor weaker over strictly stronger cuts. However, since one is only interested in the
subset of the nonnegative orthant feasible to the cuts, the choice of criterion (i) above avoids
such anomalies. Of course, as Figure 1(b) indicates, it is possible for this criterion to be unable
to recognize dominance, and treat two cuts as alternative optimal cuts even though one cut
dominates the other.

Let us now proceed to characterize the euclidean distance from the origin to the nonnegative
region feasible to a cut

(2.1)  $\sum_{j=1}^n z_j x_j \ge z_0$, where $z_0 > 0$, $z_j > 0$ for some $j \in \{1, \ldots, n\}$.

The required distance is clearly given by

(2.2)  $\theta_e = \text{minimum}\ \left\{ \|x\| : \sum_{j=1}^n z_j x_j \ge z_0,\ x \ge 0 \right\}$.

Consider the following result.

LEMMA 1: Let $\theta_e$ be defined by Equations (2.1) and (2.2). Then

(2.3)  $\theta_e = \dfrac{z_0}{\|y\|}$

where

(2.4)  $y = (y_1, \ldots, y_n)$, $\quad y_j = \text{maximum}\ \{0, z_j\}$, $j = 1, \ldots, n$.

PROOF: Note that the solution $x^* = \dfrac{z_0}{\|y\|^2}\, y$ is feasible to the problem in (2.2) with
$\|x^*\| = \dfrac{z_0}{\|y\|}$. Moreover, for any $x$ feasible to (2.2), we have $z_0 \le \sum_{j=1}^n z_j x_j \le \sum_{j=1}^n y_j x_j \le \|y\|\,\|x\|$,
or that $\|x\| \ge \dfrac{z_0}{\|y\|}$. This completes the proof.
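Lemma 1 gives the first depth criterion in closed form. A minimal Python sketch, with the hypothetical function name `euclidean_depth` and made-up cut coefficients, is:

```python
import math


def euclidean_depth(z, z0):
    """theta_e of (2.3): z0 / ||y|| with y_j = max(0, z_j)."""
    y = [max(0.0, z_j) for z_j in z]
    norm_y = math.sqrt(sum(v * v for v in y))
    if z0 <= 0 or norm_y == 0:
        raise ValueError("cut (2.1) requires z0 > 0 and some z_j > 0")
    return z0 / norm_y


# Hypothetical cut 3*x1 + 4*x2 - 5*x3 >= 10: the negative coefficient is ignored.
print(euclidean_depth([3, 4, -5], 10))   # 2.0
```

The negative coefficient plays no role because the nearest feasible point $x^* = (z_0 / \|y\|^2)\, y$ has a zero in that coordinate.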


Now, let us consider the second criterion. The motivation for this criterion is similar to
that for the first criterion and moreover, as we shall see below, the use of this criterion has
intuitive appeal. First of all, given a cut (2.1), let us characterize the rectilinear distance from
the origin to the nonnegative region feasible to this cut. This distance is given by

(2.5)  $\theta_r = \text{minimum}\ \left\{ |x| : \sum_{j=1}^n z_j x_j \ge z_0,\ x \ge 0 \right\}$, where $|x| = \sum_{j=1}^n |x_j|$.

Consider the following result.


Figure 1. Recognition of dominance


LEMMA 2: Let $\theta_r$ be defined by Equations (2.1) and (2.5). Then,

(2.6)  $\theta_r = \dfrac{z_0}{z_m}$, where $z_m = \text{maximum}_{j = 1, \ldots, n}\ z_j$.

PROOF: Note that the solution $x^* = (0, \ldots, z_0 / z_m, \ldots, 0)$, with the $m$th component
being non-zero, is feasible to the problem in (2.5) with $|x^*| = z_0 / z_m$. Moreover, for any $x$
feasible to (2.5), we have

$\dfrac{z_0}{z_m} \le \sum_{j=1}^n \dfrac{z_j}{z_m}\, x_j \le \sum_{j=1}^n x_j = |x|$.

This completes the proof.


Note from Equation (2.6) that the objective of maximizing $\theta_r$ is equivalent to finding a
cut which maximizes the smallest positive intercept made on any axis. Hence, the intuitive
appeal of this criterion.
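The equivalence between the rectilinear distance of Lemma 2 and the smallest positive axis intercept can be checked numerically. This is a minimal Python sketch with hypothetical function names and made-up coefficients:

```python
def rectilinear_depth(z, z0):
    """theta_r of (2.6): z0 divided by the largest coefficient z_m."""
    z_m = max(z)
    if z0 <= 0 or z_m <= 0:
        raise ValueError("cut (2.1) requires z0 > 0 and some z_j > 0")
    return z0 / z_m


def smallest_positive_intercept(z, z0):
    """Smallest intercept z0 / z_j over the axes with z_j > 0."""
    return min(z0 / z_j for z_j in z if z_j > 0)


# Hypothetical cut 3*x1 + 4*x2 - 5*x3 >= 10: intercepts are 10/3 on x1 and 2.5 on x2.
z, z0 = [3, 4, -5], 10
print(rectilinear_depth(z, z0))             # 2.5
print(smallest_positive_intercept(z, z0))   # 2.5
```

The two functions agree because dividing $z_0$ by the largest coefficient yields the smallest of the positive intercepts.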


3.  DERIVING  DEEP  CUTS  FOR  DC1 


It is very encouraging to note that for the disjunction DC1 we are able to derive a cut
which not only simultaneously satisfies both the criteria of Section 2, but which is also a facet
of the set $S$ of Equation (1.3). This is a powerful statement since all valid inequalities are given
through (1.2) and none of these can strictly dominate a facet of $S$.


458  H. D. SHERALI AND C. M. SHETTY

We will find it more convenient to state our results if we normalize the linear inequalities
(1.4) by dividing through by their respective, positive, right-hand-sides. Hence, let us assume
without loss of generality that

(3.1)  $S_h = \left\{ x : \sum_{j=1}^n a_{1j}^h x_j \ge 1,\ x \ge 0 \right\}$ for $h \in H = \{1, \ldots, \hat{h}\}$.

Then the application of Theorem 1 to the disjunction DC1 yields valid cuts of the form:

(3.2)  $\sum_{j=1}^n \left\{ \max_{h \in H} \lambda_1^h a_{1j}^h \right\} x_j \ge \min_{h \in H} \{\lambda_1^h\}$

where $\lambda_1^h$, $h \in H$ are nonnegative scalars. Again, there is no loss of generality in assuming that

(3.3)  $\sum_{h \in H} \lambda_1^h = 1$, $\quad \lambda_1^h \ge 0$, $h \in H = \{1, \ldots, \hat{h}\}$

since we will not allow all $\lambda_1^h$, $h \in H$ to be zero. This is equivalent to normalizing (3.2) by
dividing through by $\sum_{h \in H} \lambda_1^h$.

Theorem 2 below derives two cuts of the type (3.2), both of which simultaneously
achieve the two criteria of the foregoing section. However, the second cut uniformly dominates
the first cut. In fact, no cut can strictly dominate the second cut since it is shown to be a facet
of $S$ defined by (1.3).

THEOREM 2: Consider the disjunctive statement DC1 where $S_h$ is defined by (3.1) and is
assumed to be consistent for each $h \in H$. Then the following results hold:

(a) Both the criteria of Section 2 are satisfied by letting $\lambda_1^h = \lambda_1^{h*}$, where

(3.4)  $\lambda_1^{h*} = 1/\hat{h}$ for $h \in H$

in inequality (3.2) to obtain the cut

(3.5)  $\sum_{j=1}^n \alpha_{1j}^* x_j \ge 1$, where $\alpha_{1j}^* = \max_{h \in H} a_{1j}^h$ for $j = 1, \ldots, n$.

(b) Further, defining

(3.6)  $\gamma_1^h = \text{minimum}_{j:\ a_{1j}^h > 0}\ \{\alpha_{1j}^* / a_{1j}^h\} > 0$, $\quad h \in H$

and letting $\lambda_1^h = \lambda_1^{h**}$, where

(3.7)  $\lambda_1^{h**} = \gamma_1^h \Big/ \sum_{p \in H} \gamma_1^p$ for $h \in H$

in inequality (3.2), we obtain a cut of the form

(3.8)  $\sum_{j=1}^n \alpha_{1j}^{**} x_j \ge 1$, where $\alpha_{1j}^{**} = \max_{h \in H} a_{1j}^h \gamma_1^h$ for $j = 1, \ldots, n$

which again satisfies both the criteria of Section 2.

(c) The cut (3.8) uniformly dominates the cut (3.5); in fact,

(3.9)  $\alpha_{1j}^{**} = \alpha_{1j}^*$ if $\alpha_{1j}^* > 0$, and $\alpha_{1j}^{**} \le \alpha_{1j}^*$ if $\alpha_{1j}^* \le 0$, $\quad j = 1, \ldots, n$.

(d) The cut (3.8) is a facet of the set $S$ of Equation (1.3).
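The two cuts of Theorem 2 can be computed directly from the normalized coefficients. The following minimal Python sketch evaluates (3.5), (3.6) and (3.8); the function name `dc1_cuts`, the list-of-rows layout `a[h][j]` for the coefficients of (3.1), and the two-system example are illustrative assumptions.

```python
def dc1_cuts(a):
    """Return (alpha_star, gamma, alpha_2star) for normalized systems sum_j a[h][j] x_j >= 1."""
    n = len(a[0])
    # (3.5): alpha*_j is the largest coefficient of x_j over all systems.
    alpha_star = [max(row[j] for row in a) for j in range(n)]
    # (3.6): gamma^h is the smallest ratio alpha*_j / a[h][j] over positive a[h][j].
    gamma = []
    for row in a:
        ratios = [alpha_star[j] / row[j] for j in range(n) if row[j] > 0]
        if not ratios:
            raise ValueError("a system with no positive coefficient is inconsistent")
        gamma.append(min(ratios))
    # (3.8): alpha**_j is the largest scaled coefficient a[h][j] * gamma^h.
    alpha_2star = [max(row[j] * g for row, g in zip(a, gamma)) for j in range(n)]
    return alpha_star, gamma, alpha_2star


# Hypothetical disjunction: x1 - 2*x2 >= 1  or  0.5*x1 - x2 >= 1.
a = [[1.0, -2.0],
     [0.5, -1.0]]
alpha_star, gamma, alpha_2star = dc1_cuts(a)
print(alpha_star)    # [1.0, -1.0]  -> cut (3.5): x1 - x2 >= 1
print(gamma)         # [1.0, 2.0]
print(alpha_2star)   # [1.0, -2.0]  -> cut (3.8): x1 - 2*x2 >= 1
```

In this example the two cuts agree on the positive coefficient and differ on the nonpositive one, as (3.9) states, so the second cut is the stronger of the two.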


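PROOF:

(a) Clearly, $\lambda_1^h = 1/\hat{h}$, $h \in H$ leads to the cut (3.5) from (3.2). Now consider the
euclidean distance criterion of maximizing $\theta_e$ (or $\theta_e^2$) of Equation (2.3). For cut (3.5), the
value of $\theta_e^2$ is given by

(3.10)  $(\theta_e^*)^2 = 1 \Big/ \sum_{j=1}^n (y_j^*)^2 > 0$, where $y_j^* = \max\{0, \alpha_{1j}^*\}$, $j = 1, \ldots, n$.

Now, for any choice $\lambda_1^h$, $h \in H$,

(3.11)  $\theta_e^2 = \left[ \min_{h \in H} (\lambda_1^h) \right]^2 \Big/ \sum_{j=1}^n y_j^2 = (\lambda_1^p)^2 \Big/ \sum_{j=1}^n y_j^2$, say,

where $y_j = \max\{0, \max_{h \in H} \lambda_1^h a_{1j}^h\}$. If $\lambda_1^p = 0$, then $\theta_e = 0$ and, noting (3.10), such a choice of
parameters $\lambda_1^h$, $h \in H$ is suboptimal. Hence, $\lambda_1^p > 0$, whence (3.11) becomes
$\theta_e^2 = 1 \Big/ \sum_{j=1}^n (y_j / \lambda_1^p)^2$. But since $(\lambda_1^h / \lambda_1^p) \ge 1$ for each $h \in H$, we get

$\dfrac{y_j}{\lambda_1^p} = \max\left\{0, \max_{h \in H} \dfrac{\lambda_1^h}{\lambda_1^p}\, a_{1j}^h \right\} \ge \max\left\{0, \max_{h \in H} a_{1j}^h \right\} = y_j^*$.

Thus $\theta_e^2 \le (\theta_e^*)^2$, so that the first criterion is satisfied.

Now consider the maximization of $\theta_r$ of Equation (2.5), or equivalently Equation (2.6).
For the choice (3.4), the value of $\theta_r$ is given by

(3.12)  $\theta_r^* = \dfrac{1}{\max_j \alpha_{1j}^*} > 0$.

Now, for any choice $\lambda_1^h$, $h \in H$, from Equations (2.6), (3.2) we get

$\theta_r = \left( \min_{h \in H} \lambda_1^h \right) \Big/ \left( \max_j \max_{h \in H} \lambda_1^h a_{1j}^h \right) = \lambda_1^p \Big/ \max_j \max_{h \in H} \lambda_1^h a_{1j}^h$, say.

As before, $\lambda_1^p = 0$ implies a value of $\theta_r$ inferior to $\theta_r^*$. Thus, assume $\lambda_1^p > 0$. Then,
$\theta_r = 1 \Big/ \max_j \max_{h \in H} \dfrac{\lambda_1^h}{\lambda_1^p}\, a_{1j}^h$. But $(\lambda_1^h / \lambda_1^p) \ge 1$ for each $h \in H$ and in evaluating $\theta_r$, we are interested
only in those $j \in \{1, \ldots, n\}$ for which $a_{1j}^h > 0$ for some $h \in H$. Thus, $\theta_r \le 1 \Big/ \max_j \max_{h \in H} a_{1j}^h = \theta_r^*$,
so that the second criterion is also satisfied. This proves part (a).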


(b) and (c). First of all, let us consider the values taken by $\gamma_1^h$, $h \in H$. Note from the
assumption of consistency that $\gamma_1^h$, $h \in H$ are well defined. From (3.5), (3.6), we must have
$\gamma_1^h \ge 1$ for each $h \in H$. Moreover, if we define from (3.5)

(3.13)  $H^* = \{h \in H : a_{1k}^h = \alpha_{1k}^* > 0 \text{ for some } k \in \{1, \ldots, n\}\}$

then clearly $H^* \ne \emptyset$ and for $h \in H^*$, Equation (3.6) implies $\gamma_1^h \le 1$. Thus,

(3.14)  $\gamma_1^h = 1$ for $h \in H^*$, and $\gamma_1^h \ge 1$ for $h \notin H^*$.

Hence,

(3.15)  $\min_{h \in H} \gamma_1^h = 1$

or that, using (3.7) in (3.2) yields a cut of the type (3.8), where

(3.16)  $\alpha_{1j}^{**} = \max_{h \in H} a_{1j}^h \gamma_1^h$, $\quad j = 1, \ldots, n$.

Now, let us establish relationship (3.9). Note from (3.5) that if $\alpha_{1j}^* \le 0$, then $a_{1j}^h \le 0$
for each $h \in H$ and hence, using (3.14), (3.16), we get that (3.9) holds. Next, consider $\alpha_{1j}^* > 0$
for some $j \in \{1, \ldots, n\}$. From (3.13), (3.14), (3.16), we get

(3.17)  $\alpha_{1j}^{**} = \max\left\{ \max_{h \in H^*} a_{1j}^h,\ \max_{h \notin H^*,\ a_{1j}^h > 0} a_{1j}^h \gamma_1^h \right\}$

where we have not considered $h \notin H^*$ with $a_{1j}^h \le 0$ since $\alpha_{1j}^{**} > 0$. But for $h \notin H^*$ with $a_{1j}^h > 0$,
we get from (3.5), (3.6)

(3.18)  $a_{1j}^h \gamma_1^h = a_{1j}^h \min_{k:\ a_{1k}^h > 0} \left\{ \dfrac{\max_{r \in H} a_{1k}^r}{a_{1k}^h} \right\} \le a_{1j}^h\, \dfrac{\max_{r \in H} a_{1j}^r}{a_{1j}^h} = \max_{r \in H} a_{1j}^r$.

Using (3.18) in (3.17) yields $\alpha_{1j}^{**} = \alpha_{1j}^*$, which establishes (3.9).

Finally, we show that (3.8) satisfies both the criteria of Section 2. This part follows
immediately from (3.9) by noting that the cut (3.5) yields $\theta_e = \theta_e^*$ of (3.10) and $\theta_r = \theta_r^*$ of
(3.12). This completes the proofs of parts (b) and (c).

(d) Note that since (3.8) is valid, any $x \in S$ satisfies (3.8). Hence, in order to show that (3.8) defines a facet of $S$, it is sufficient to identify $n$ affinely independent points of $S$ which satisfy (3.8) as an equality, since clearly, dim $S = n$. Define

(3.19) $J_1 = \{j \in \{1, \ldots, n\}: a_{1j}^* > 0\}$ and let $J_2 = \{1, \ldots, n\} - J_1$.

Consider any $p \in J_1$, and let

(3.20) $e_p = (0, \ldots, 1/a_{1p}^{**}, \ldots, 0)$, $p \in J_1$

have the non-zero term in the $p$th position. Now, since $p \in J_1$, (3.9) yields

$a_{1p}^{**} = a_{1p}^* = \max_{h \in H} a_{1p}^h = a_{1p}^{h_p}$, say.

Hence, $e_p \in S_{h_p}$ and so, $e_p \in S$ and moreover, $e_p$ satisfies (3.8) as an equality. Thus, $e_p$, $p \in J_1$ qualify as $|J_1|$ of the $n$ affinely independent points we are seeking.


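Now consider a $q \in J_2$. Let us show that there exists an $S_{h_q}$ satisfying

$\gamma_1^{h_q} a_{1p}^{h_q} = a_{1p}^{**}$ for some $p \in J_1$

and

(3.21) $\gamma_1^{h_q} a_{1q}^{h_q} = a_{1q}^{**}$.

From Equation (3.16), we get $a_{1q}^{**} = \max_{h \in H} a_{1q}^h \gamma_1^h = a_{1q}^{h_q} \gamma_1^{h_q}$, say. Then for this $h_q \in H$, Equation (3.6) yields $\gamma_1^{h_q} = \min_{j:\ a_{1j}^{h_q} > 0} \{a_{1j}^* / a_{1j}^{h_q}\} = a_{1p}^* / a_{1p}^{h_q}$, say. Or, using (3.9), $a_{1p}^{h_q} \gamma_1^{h_q} = a_{1p}^* = a_{1p}^{**} > 0$. Thus (3.21) holds. For convenience, let us rewrite the set $S_{h_q}$ below as

(3.22) $S_{h_q} = \{x: a_{1p}^{h_q} x_p + a_{1q}^{h_q} x_q + \sum_{j \neq p,q} a_{1j}^{h_q} x_j \geq 1,\ x \geq 0\}$.

Now, consider the direction

(3.23) $d_q = (0, \ldots, 1/a_{1p}^{**}, \ldots, -1/a_{1q}^{**}, \ldots, 0)$ if $a_{1q}^{**} < 0$, and $d_q = (0, \ldots, \Delta, \ldots, 0)$ if $a_{1q}^{**} = 0$,

where $\Delta > 0$, the entry $1/a_{1p}^{**}$ is in position $p$, and the entries $-1/a_{1q}^{**}$ and $\Delta$ are in position $q$. Let us show that $d_q$ is a direction for $S_{h_q}$. Clearly, if $a_{1q}^{**} = 0$, then from (3.21) $a_{1q}^{h_q} = 0$ and thus (3.22) shows that $d_q$ of (3.23) is a direction for $S_{h_q}$. Further, if $a_{1q}^{**} < 0$ then one may easily verify from (3.21), (3.22), (3.23) that

$\hat{e}_p = (0, \ldots, \gamma_1^{h_q}/a_{1p}^{**}, \ldots, 0) \in S_{h_q}$ and $\hat{e}_p + \delta[\gamma_1^{h_q} d_q] \in S_{h_q}$ for each $\delta \geq 0$

where $\hat{e}_p$ has the non-zero term at position $p$. Thus, $d_q$ is a direction for $S_{h_q}$. It can be easily shown that this implies $d_q$ is a direction for $S$. Since $e_p = (0, \ldots, 1/a_{1p}^{**}, \ldots, 0)$ of Equation (3.20) belongs to $S$, then so does $(e_p + d_q)$. But $(e_p + d_q)$ clearly satisfies (3.8) as an equality. Hence, we have identified $n$ points of $S$, which satisfy the cut (3.8) as an equality, of the type

(3.24) $e_p = (0, \ldots, 1/a_{1p}^{**}, \ldots, 0)$ for $p \in J_1$; $e_q = d_q + e_p$ for some $p \in J_1$, for each $q \in J_2$

where $d_q$ is given by (3.23). Since these $n$ points are clearly affinely independent, this completes the proof.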


It is interesting to note that the cut (3.5) has been derived by Balas [2] and by Glover [9, Theorem 1]. Further, the cut (3.8) is precisely the strengthened negative edge extension cut of Glover [9, Theorem 2]. The effect of replacing $\lambda_1^h$ defined in (3.4) by $\lambda_1^h$ defined in (3.7) is equivalent to the translation of certain hyperplanes in Glover's theorem. We have hence shown through Theorem 2 how the latter cut may be derived in the context of disjunctive programming, and be shown to be a facet of the convex hull of feasible points. Further, both (3.5) and (3.8) have been shown to be alternative optima to the two criteria of Section 2.


In  generalizing  this  to  disjunction  DC2,  we  find  that  such  an  ideal  situation  no  longer 
exists.  Nevertheless,  we  are  able  to  obtain  some  useful  results.  But  before  proceeding  to  DC2, 
let  us  illustrate  the  above  concepts  through  an  example. 
EXAMPLE: Let $H = \{1, 2\}$, $n = 3$ and let DC1 be formulated through the sets

S{  -  (x:  x,  +  2x2  -  4x3  >  1.  x  >  0},S2  -  {x:  —  +  —  -  2xj  ^  1,  x  >  0}. 

The cut (3.5), i.e., $\sum_j a_{1j}^* x_j \geq 1$, is $x_1 + 2x_2 - 2x_3 \geq 1$. From (3.6),

y!  -  minj-p  }|  -  1  and  y?  -  min  j-^j,  -  2. 

Thus, through (3.7), or more directly, from (3.16), the cut (3.8), i.e., $\sum_j a_{1j}^{**} x_j \geq 1$, is $x_1 + 2x_2 - 4x_3 \geq 1$. This cut strictly dominates the cut (3.5) in this example, though both have the same values $1/\sqrt{5}$ and $1/2$ respectively for $\theta_e$ and $\theta_r$ of Equations (2.2) and (2.5).
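
The computation in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `dc1_cuts` is hypothetical, and because the fractional coefficients of $x_1$ and $x_2$ in $S_2$ are not legible in this copy, the second row below uses stand-in values (1/2 and 1) chosen so that $\gamma_1^1 = 1$ and $\gamma_1^2 = 2$ as stated in the text. Exact rational arithmetic is used so the results can be compared directly with the cuts quoted above.

```python
from fractions import Fraction as F

def dc1_cuts(rows):
    """rows[h] holds the coefficients of the single constraint of S_h (right-hand side 1).

    Returns the coefficients of cut (3.5), the multipliers of (3.6),
    and the coefficients of cut (3.8) computed through (3.16).
    Assumes every row has at least one positive coefficient (consistency).
    """
    n = len(rows[0])
    # Cut (3.5): componentwise maximum over the disjuncts.
    a_star = [max(r[j] for r in rows) for j in range(n)]
    # (3.6): smallest ratio a*_j / a^h_j over the positive coefficients of row h.
    gamma = [min(a_star[j] / r[j] for j in range(n) if r[j] > 0) for r in rows]
    # (3.16): componentwise maximum of the scaled rows.
    a_2star = [max(g * r[j] for g, r in zip(gamma, rows)) for j in range(n)]
    return a_star, gamma, a_2star

rows = [
    [F(1), F(2), F(-4)],      # S_1 as printed in the example
    [F(1, 2), F(1), F(-2)],   # ASSUMED stand-in for S_2 (x_1, x_2 coefficients illegible)
]
a_star, gamma, a_2star = dc1_cuts(rows)
print(a_star, gamma, a_2star)
```

With these inputs the sketch reproduces the coefficients $(1, 2, -2)$ of cut (3.5) and $(1, 2, -4)$ of cut (3.8); only the negative coefficient changes, which is why both cuts share the same criterion values.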

4.  DERIVING  DEEP  CUTS  FOR  DC2 

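To begin with, let us make the following interesting observation. Suppose that for convenience, we assume without loss of generality as before, that $b_i^h = 1$, $i \in Q_h$, $h \in H$ in Equation (1.4). Thus, for each $h \in H$, we have the constraint set

(4.1) $S_h = \{x: \sum_{j=1}^{n} a_{ij}^h x_j \geq 1,\ i \in Q_h,\ x \geq 0\}$.

Now for each $h \in H$, let us multiply the constraints of $S_h$ by corresponding scalars $\delta_i^h \geq 0$, $i \in Q_h$ and add them up to obtain the surrogate constraint

(4.2) $\sum_{j=1}^{n} [\sum_{i \in Q_h} \delta_i^h a_{ij}^h] x_j \geq \sum_{i \in Q_h} \delta_i^h$, $h \in H$.

Further, assuming that not all $\delta_i^h$ are zero for $i \in Q_h$, (4.2) may be re-written as

(4.3) $\sum_{j=1}^{n} [\sum_{i \in Q_h} (\delta_i^h / \sum_{p \in Q_h} \delta_p^h) a_{ij}^h] x_j \geq 1$, $h \in H$.

Finally, denoting $\delta_i^h / \sum_{p \in Q_h} \delta_p^h$ by $\lambda_i^h$ for $i \in Q_h$, $h \in H$, we may write (4.3) as

(4.4) $\sum_{j=1}^{n} [\sum_{i \in Q_h} \lambda_i^h a_{ij}^h] x_j \geq 1$ for each $h \in H$

where,

(4.5) $\sum_{i \in Q_h} \lambda_i^h = 1$ for each $h \in H$, $\lambda_i^h \geq 0$ for $i \in Q_h$, $h \in H$.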


Observe that by surrogating the constraints of (4.1) using parameters $\lambda_i^h$, $i \in Q_h$, $h \in H$ satisfying (4.5), we have essentially represented DC2 as DC1 through (4.4). In other words, since $x \in S_h$ implies $x$ satisfies (4.4) for each $h \in H$, then given $\lambda_i^h$, $i \in Q_h$, $h \in H$, DC2 implies that at least one of (4.4) must be satisfied. Now, whereas Theorem 1 would directly employ (4.2) to derive a cut, since we have normalized (4.2) to obtain (4.4), we know from the previous section that the optimal strategy is to derive a cut (3.8) using inequalities (4.4).


Now  let  us  consider  in  turn  the  two  criteria  of  Section  2. 
4.1. Euclidean Distance-Based Criterion

Consider any selection of values for the parameters $\lambda_i^h$, $i \in Q_h$, $h \in H$ satisfying (4.5) and let the corresponding disjunction DC1 derived from DC2 be that at least one of (4.4) must hold. Then, Theorem 2 tells us through Equations (3.5), (3.10) that the euclidean distance criterion value for the resulting cut (3.8) is

(4.6) $\theta_e(\lambda) = 1 / [\sum_{j=1}^{n} y_j^2]^{1/2}$

where,

(4.7) $y_j = \max\{0, z_j\}$, $j = 1, \ldots, n$

and

(4.8) $z_j = \max_{h \in H} \sum_{i \in Q_h} \lambda_i^h a_{ij}^h$, $j = 1, \ldots, n$.

Thus, the criterion of Section 2 seeks to

(4.9) maximize $\{\theta_e(\lambda): \lambda = (\lambda_i^h)$ satisfies (4.5)$\}$

or equivalently, to

(4.10) minimize $\{\sum_{j=1}^{n} y_j^2$: (4.5), (4.7), (4.8) are satisfied$\}$.

It may be easily verified that the problem of (4.10) may be written as

(4.11) PD2: minimize $\sum_{j=1}^{n} y_j^2$

(4.12) subject to $y_j \geq \sum_{i \in Q_h} \lambda_i^h a_{ij}^h$ for each $h \in H$ and for each $j = 1, \ldots, n$

(4.13) $\sum_{i \in Q_h} \lambda_i^h = 1$ for each $h \in H$

(4.14) $\lambda_i^h \geq 0$, $i \in Q_h$, $h \in H$.

Note that we have deleted the constraints $y_j \geq 0$, $j = 1, \ldots, n$ since for any feasible $\lambda_i^h$, $i \in Q_h$, $h \in H$, there exists a dominant solution with nonnegative $y_j$, $j = 1, \ldots, n$. This relaxation is simply a matter of convenience in our solution strategy.
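
As a small illustration of (4.7), (4.8) and the objective of PD2, the following Python sketch evaluates $y$ and $\sum_j y_j^2$ for a given choice of the multipliers. The helper names `surrogate` and `pd2_objective` are hypothetical, and the data are the two constraint sets of the illustrative example of Section 4.1.7 as I read them from this copy, so they should be treated as an assumption.

```python
from fractions import Fraction as F

# Rows a^h_{ij} of S_1 and S_2 (read from the example of Section 4.1.7; an assumption).
A = {
    1: [[2, -3, 1], [-1, 2, 3], [3, -1, -1]],
    2: [[3, -1, -1], [2, 1, -2], [-1, 3, 3]],
}

def surrogate(rows, lam):
    """Coefficients sum_i lam_i * a_ij of the surrogate constraint (4.4)."""
    return [sum(l * r[j] for l, r in zip(lam, rows)) for j in range(len(rows[0]))]

def pd2_objective(A, lam):
    """Return (y, sum_j y_j^2) with y_j = max{0, z_j} as in (4.7)-(4.8)."""
    coeffs = [surrogate(A[h], lam[h]) for h in sorted(A)]
    y = [max(0, *(c[j] for c in coeffs)) for j in range(len(coeffs[0]))]
    return y, sum(v * v for v in y)

uniform = {h: [F(1, 3)] * 3 for h in A}
point = {1: [F(0), F(5, 12), F(7, 12)], 2: [F(7, 12), F(0), F(5, 12)]}
y_uniform, f_uniform = pd2_objective(A, uniform)
y_point, f_point = pd2_objective(A, point)
print(y_uniform, f_uniform)
print(y_point, f_point)
```

Since the criterion value of (4.6) is the reciprocal of the square root of this objective, the smaller objective at the second point corresponds to a cut lying farther from the origin.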

Before proposing a solution procedure for Problem PD2, let us make some pertinent remarks. Note that Problem PD2 has the purpose of generating parameters $\lambda_i^h$, $i \in Q_h$, $h \in H$ which are to be used to obtain the surrogate constraints (4.4). Thereafter, the cut that we derive for the disjunction DC2 is the cut (3.8) obtained from the statement that at least one of (4.4) must hold. Hence, Problem PD2 attempts to find values for $\lambda_i^h$, $i \in Q_h$, $h \in H$, such that this resulting cut achieves the euclidean distance criterion.

Problem PD2 is a convex quadratic program for which the Kuhn-Tucker conditions are both necessary and sufficient. Several efficient simplex-based quadratic programming procedures are available to solve such a problem. However, these procedures require explicit handling of the potentially large number of constraints in Problem PD2. On the other hand, the subgradient optimization procedure discussed below takes full advantage of the problem structure. We are first able to write out an almost complete solution to the Kuhn-Tucker system. We will refer to this as a partial solution. In case we are unable to either actually construct a complete solution or to assert that a feasible completion exists, then through the construction procedure itself, we have a subgradient direction available. Moreover, this latter direction is very likely to be a direction of ascent. We therefore propose to move in the negative of this direction and if necessary, project back onto the feasible region. These iterative steps are now repeated at this new point.

4.1.1 Kuhn-Tucker System for PD2 and Its Implications

Letting $u_j^h$, $h \in H$, $j = 1, \ldots, n$ denote the lagrangian multipliers for constraints (4.12), $t_h$, $h \in H$ those for constraints (4.13), and $w_i^h$, $i \in Q_h$, $h \in H$ those for constraints (4.14), we may write the Kuhn-Tucker optimality conditions as

(4.15) $\sum_{h \in H} u_j^h = 2y_j$, $j = 1, \ldots, n$

(4.16) $\sum_{j=1}^{n} u_j^h a_{ij}^h + t_h - w_i^h = 0$ for each $i \in Q_h$, and for each $h \in H$

(4.17) $u_j^h [\sum_{i \in Q_h} \lambda_i^h a_{ij}^h - y_j] = 0$ for each $j = 1, \ldots, n$ and each $h \in H$

(4.18) $\lambda_i^h w_i^h = 0$ for $i \in Q_h$, $h \in H$

(4.19) $w_i^h \geq 0$, $i \in Q_h$, $h \in H$

(4.20) $u_j^h \geq 0$, $j = 1, \ldots, n$, $h \in H$.

Finally, Equations (4.12), (4.13), (4.14) must also hold. We will now consider the implications of the above conditions. This will enable us to construct at least a partial solution to these conditions, given particular values of $\lambda_i^h$, $i \in Q_h$, $h \in H$. First of all, note that Equations (4.7), (4.10) and (4.20) imply that

(4.21) $y_j \geq 0$ for each $j = 1, \ldots, n$

(4.22) $y_j = \max\{0,\ \sum_{i \in Q_h} \lambda_i^h a_{ij}^h,\ h \in H\}$ for $j = 1, \ldots, n$.

Now, having determined values for $y_j$, $j = 1, \ldots, n$, let us define the sets

(4.23) $H_j = \{\emptyset\}$ if $y_j = 0$, and $H_j = \{h \in H: y_j = \sum_{i \in Q_h} \lambda_i^h a_{ij}^h > 0\}$ otherwise, for $j = 1, \ldots, n$.

Now, consider the determination of $u_j^h$, $h \in H$, $j = 1, \ldots, n$. Clearly, Equations (4.15), (4.17) and (4.20) along with the definition (4.23) imply that for each $j = 1, \ldots, n$

(4.24) $u_j^h = 0$ for $h \in H/H_j$ and that $\sum_{h \in H_j} u_j^h = 2y_j$, $u_j^h \geq 0$ for each $h \in H_j$.

Thus, for any $j \in \{1, \ldots, n\}$, if $H_j$ is either empty or a singleton, the corresponding values for $u_j^h$, $h \in H$ are uniquely determined. Hence, we have a choice in selecting values for $u_j^h$, $h \in H_j$ only when $|H_j| \geq 2$ for any $j \in \{1, \ldots, n\}$. Next, multiplying (4.16) by $\lambda_i^h$, summing over $i \in Q_h$ and using (4.18), we obtain

(4.25) $\sum_{j=1}^{n} u_j^h [\sum_{i \in Q_h} \lambda_i^h a_{ij}^h] + t_h \sum_{i \in Q_h} \lambda_i^h = 0$ for each $h \in H$.

Using Equations (4.13), (4.17), this gives us

(4.26) $t_h = -\sum_{j=1}^{n} u_j^h y_j$ for each $h \in H$.

Finally, Equations (4.16), (4.26) yield

(4.27) $w_i^h = \sum_{j=1}^{n} u_j^h [a_{ij}^h - y_j]$ for each $i \in Q_h$, $h \in H$.

Notice that once the variables $u_j^h$, $h \in H$, $j = 1, \ldots, n$ are fixed to satisfy (4.24), all the variables are uniquely determined. We now show that if the variables $w_i^h$, $i \in Q_h$, $h \in H$ so determined are nonnegative, we then have a Kuhn-Tucker solution. Since the objective function of PD2 is convex and the constraints are linear, this solution is also optimal.

LEMMA 2: Let a primal feasible set of $\lambda_i^h$, $i \in Q_h$, $h \in H$ be given. Determine values for all variables $y_j$, $u_j^h$, $t_h$, $w_i^h$ using Equations (4.22) through (4.27), selecting an arbitrary solution in the case described in Equation (4.24) if $|H_j| \geq 2$. If $w_i^h \geq 0$, $i \in Q_h$, $h \in H$, then $\lambda_i^h$, $i \in Q_h$, $h \in H$ solves Problem PD2.

PROOF: By construction Equations (4.12) through (4.17), and (4.20) clearly hold. Thus, noting that in our problem the Kuhn-Tucker conditions are sufficient for optimality, all we need to show is that if $w = (w_i^h) \geq 0$ then (4.18) holds. But from (4.17) and (4.27) for any $h \in H$, we have,

$\sum_{i \in Q_h} \lambda_i^h w_i^h = \sum_{j=1}^{n} u_j^h [\sum_{i \in Q_h} \lambda_i^h a_{ij}^h - y_j] = 0$

for each $h \in H$. Thus, $\lambda_i^h \geq 0$, $w_i^h \geq 0$, $i \in Q_h$, $h \in H$ imply that (4.18) holds and the proof is complete.
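
The construction used in Lemma 2 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function `partial_kt_solution` and its `choose` argument (which decides which $h \in H_j$ receives all of $2y_j$ when $|H_j| \geq 2$) are hypothetical names, and the data are the constraint sets and the point of the example of Section 4.1.7 as I read them from this copy. Exact fractions are used so that membership in $H_j$ can be tested by equality; with floating point a tolerance would be needed.

```python
from fractions import Fraction as F

# Data read from the example of Section 4.1.7 (an assumption of this sketch).
A = {
    1: [[2, -3, 1], [-1, 2, 3], [3, -1, -1]],
    2: [[3, -1, -1], [2, 1, -2], [-1, 3, 3]],
}
lam = {1: [F(0), F(5, 12), F(7, 12)], 2: [F(7, 12), F(0), F(5, 12)]}

def partial_kt_solution(A, lam, choose):
    """Build y, H_j, u, t, w from (4.22)-(4.27); choose(j, H_j) picks one h in H_j."""
    hs = sorted(A)
    n = len(A[hs[0]][0])
    coef = {h: [sum(l * r[j] for l, r in zip(lam[h], A[h])) for j in range(n)] for h in hs}
    y = [max(0, *(coef[h][j] for h in hs)) for j in range(n)]                      # (4.22)
    Hj = [[h for h in hs if y[j] > 0 and coef[h][j] == y[j]] for j in range(n)]    # (4.23)
    u = {h: [F(0)] * n for h in hs}
    for j in range(n):                       # one particular solution of (4.24)
        if Hj[j]:
            u[choose(j, Hj[j])][j] = 2 * y[j]
    t = {h: -sum(u[h][j] * y[j] for j in range(n)) for h in hs}                    # (4.26)
    w = {h: [sum(u[h][j] * (row[j] - y[j]) for j in range(n)) for row in A[h]]
         for h in hs}                                                              # (4.27)
    return y, Hj, u, t, w

# Two arbitrary selections: credit each j to the last, or to the first, index of H_j.
y, Hj, u_last, t_last, w_last = partial_kt_solution(A, lam, lambda j, H: H[-1])
_, _, u_first, t_first, w_first = partial_kt_solution(A, lam, lambda j, H: H[0])
lemma2_applies = all(v >= 0 for ws in w_last.values() for v in ws)
print(w_last, w_first, lemma2_applies)
```

In both selections some component of $w$ is negative, so the sufficient condition of Lemma 2 does not apply at this point, while the identity $\sum_{i \in Q_h} \lambda_i^h w_i^h = 0$ from the proof still holds.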


The reader may note that in Section 4.1.4 we will propose another stronger sufficient condition for a set of variables $\lambda_i^h$, $i \in Q_h$, $h \in H$ to be optimal. The development of this condition is based on a subgradient optimization procedure discussed below.


4.1.2 Subgradient Optimization Scheme for Problem PD2

For the purpose of this development, let us use (4.22) to rewrite Problem PD2 as follows. First of all define

(4.28) $\Lambda = \{\lambda = (\lambda_i^h)$: constraints (4.13) and (4.14) are satisfied$\}$

and let $f: \Lambda \to R$ be defined by

(4.29) $f(\lambda) = \sum_{j=1}^{n} [\max\{0,\ \sum_{i \in Q_h} \lambda_i^h a_{ij}^h,\ h \in H\}]^2$.

Then, Problem PD2 may be written as

minimize $\{f(\lambda): \lambda \in \Lambda\}$.

Note that for each $j = 1, \ldots, n$, $g_j(\lambda) = \max\{0,\ \sum_{i \in Q_h} \lambda_i^h a_{ij}^h,\ h \in H\}$ is convex and nonnegative. Thus, $[g_j(\lambda)]^2$ is convex and so $f(\lambda) = \sum_{j=1}^{n} [g_j(\lambda)]^2$ is also convex.


The main thrust of the proposed algorithm is as follows. Having a solution $\bar{\lambda}$ at any stage, we will attempt to construct a solution to the Kuhn-Tucker system using Equations (4.15) through (4.20). If we obtain nonnegative values $\bar{w}_i^h$ for the corresponding variables $w_i^h$, $i \in Q_h$, $h \in H$, then by Lemma 2 above, we terminate. Later in Section 4.1.7, we will also use another sufficient condition to check for termination. If we obtain no indication of optimality, we continue. Theorem 3 below establishes that in any case, the vector $w = \bar{w}$ constitutes a subgradient of $f(\cdot)$ at the current point $\bar{\lambda}$. Following Poljak [18,19], we hence take a suitable step in the negative subgradient direction and project back onto the feasible region $\Lambda$ of Equation (4.28). This completes one iteration. Before presenting Theorem 3, consider the following definition.


DEFINITION 1: Let $f: \Lambda \to R$ be a convex function and let $\bar{\lambda} \in \Lambda \subset R^m$. Then $\xi \in R^m$ is a subgradient of $f(\cdot)$ at $\bar{\lambda}$ if

$f(\lambda) \geq f(\bar{\lambda}) + \xi'(\lambda - \bar{\lambda})$ for each $\lambda \in \Lambda$.

THEOREM 3: Let $\bar{\lambda}$ be a given point in $\Lambda$ defined by (4.28) and let $\bar{w}$ be obtained from Equations (4.22) through (4.27), with an arbitrary selection of a solution to (4.24).

Then, $\bar{w}$ is a subgradient of $f(\cdot)$ at $\bar{\lambda}$, where $f: \Lambda \to R$ is defined in Equation (4.29).


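PROOF. Let $\bar{y}$ and $y$ be obtained through Equation (4.22) from $\bar{\lambda} \in \Lambda$ and $\lambda \in \Lambda$ respectively. Hence,

$f(\bar{\lambda}) = \sum_{j=1}^{n} \bar{y}_j^2$ and $f(\lambda) = \sum_{j=1}^{n} y_j^2$.

Thus, from Definition 1, we need to show that

(4.30) $\sum_{h \in H} \sum_{i \in Q_h} \bar{w}_i^h (\lambda_i^h - \bar{\lambda}_i^h) \leq \sum_{j=1}^{n} y_j^2 - \sum_{j=1}^{n} \bar{y}_j^2$.

Noting from Equations (4.17), (4.27) that $\sum_{h \in H} \sum_{i \in Q_h} \bar{w}_i^h \bar{\lambda}_i^h = 0$, we have,

$\sum_{h \in H} \sum_{i \in Q_h} \bar{w}_i^h (\lambda_i^h - \bar{\lambda}_i^h) = \sum_{h \in H} \sum_{i \in Q_h} \lambda_i^h \sum_{j=1}^{n} \bar{u}_j^h [a_{ij}^h - \bar{y}_j] = \sum_{h \in H} \sum_{j=1}^{n} \bar{u}_j^h [\sum_{i \in Q_h} \lambda_i^h a_{ij}^h] - \sum_{h \in H} \sum_{j=1}^{n} \bar{u}_j^h \bar{y}_j \sum_{i \in Q_h} \lambda_i^h$.

Using (4.13) and (4.15), this yields

$\sum_{h \in H} \sum_{i \in Q_h} \bar{w}_i^h (\lambda_i^h - \bar{\lambda}_i^h) = \sum_{h \in H} \sum_{j=1}^{n} \bar{u}_j^h [\sum_{i \in Q_h} \lambda_i^h a_{ij}^h] - 2 \sum_{j=1}^{n} \bar{y}_j^2$.

Combining this with (4.30), we need to show that

(4.31) $\sum_{h \in H} \sum_{j=1}^{n} \bar{u}_j^h [\sum_{i \in Q_h} \lambda_i^h a_{ij}^h] \leq \sum_{j=1}^{n} y_j^2 + \sum_{j=1}^{n} \bar{y}_j^2$.

But Equations (4.15), (4.20), (4.22) imply that

$\sum_{h \in H} \sum_{j=1}^{n} \bar{u}_j^h [\sum_{i \in Q_h} \lambda_i^h a_{ij}^h] \leq \sum_{h \in H} \sum_{j=1}^{n} \bar{u}_j^h y_j = 2 \sum_{j=1}^{n} \bar{y}_j y_j \leq 2 \|\bar{y}\| \|y\| \leq \|\bar{y}\|^2 + \|y\|^2$

so that Equation (4.31) holds. This completes the proof.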


Although, given $\bar{\lambda} \in \Lambda$, any solution to Equations (4.22) through (4.27) will yield a subgradient of $f(\cdot)$ at the current point $\bar{\lambda}$, we would like to generate, without expending much effort, a subgradient which is hopefully a direction of ascent. Hence, this would accelerate the cut generation process. Later in Section 4.1.6 we describe one such scheme to determine a suitable subgradient direction. For the present moment, let us assume that we have generated a subgradient $\bar{w}$ and have taken a suitable step size $\theta$ in the direction $-\bar{w}$ as prescribed by the subgradient optimization scheme of Held, Wolfe, and Crowder [12]. Let

(4.32) $\hat{\lambda} = \bar{\lambda} - \theta \bar{w}$

be the new point thus obtained. To complete the iteration, we must now project $\hat{\lambda}$ onto $\Lambda$, that is, we must determine a new $\lambda$ according to

(4.33) $\lambda_{new} \equiv P_\Lambda(\hat{\lambda}) = \arg\min \{\|\lambda - \hat{\lambda}\|: \lambda \in \Lambda\}$.

The method of accomplishing this efficiently is presented in the next subsection.


4.1.3 Projection Scheme

For convenience, let us define the following linear manifold

(4.34) $M_h = \{\lambda_i^h,\ i \in Q_h: \sum_{i \in Q_h} \lambda_i^h = 1\}$, $h \in H$

and let $\bar{M}_h$ be the intersection of $M_h$ with the nonnegative orthant, that is,

(4.35) $\bar{M}_h = \{\lambda_i^h,\ i \in Q_h: \sum_{i \in Q_h} \lambda_i^h = 1,\ \lambda_i^h \geq 0,\ i \in Q_h\}$.

Note from Equation (4.28) that

(4.36) $\Lambda = \bar{M}_1 \times \cdots \times \bar{M}_{|H|}$.

Now, given $\hat{\lambda}$, we want to project it onto $\Lambda$, that is, determine $\lambda_{new}$ from Equation (4.33). Towards this end, for any vector $\alpha = (\alpha_i, i \in I)$, where $I$ is a suitable index set for the $|I|$ components of $\alpha$, let $P(\alpha, I)$ denote the following problem:

(4.37) $P(\alpha, I)$: minimize $\{\frac{1}{2} \sum_{i \in I} (\lambda_i - \alpha_i)^2: \sum_{i \in I} \lambda_i = 1,\ \lambda_i \geq 0,\ i \in I\}$.

Then to determine $\lambda_{new}$, we need to find the solutions $(\lambda_{new}^h)_i$, $i \in Q_h$ as projections onto $\bar{M}_h$ of $\hat{\lambda}^h = (\hat{\lambda}_i^h, i \in Q_h)$ through each of the $|H|$ separable Problems $P(\hat{\lambda}^h, Q_h)$. Thus, henceforth in this section, we will consider only one such $h \in H$. Theorem 4 below is the basis of a finitely convergent iterative scheme to solve Problem $P(\hat{\lambda}^h, Q_h)$.
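THEOREM 4: Consider the solution of Problem $P(\beta^k, I_k)$, where $\beta^k = (\beta_i^k, i \in I_k)$, with $|I_k| \geq 1$. Define

(4.38) $\rho_k = (1 - \sum_{i \in I_k} \beta_i^k) / |I_k|$

and let

(4.39) $\hat{\beta}^k = \beta^k + (\rho_k) 1_k$

where $1_k$ denotes a vector of $|I_k|$ elements, each equal to unity. Further, define

(4.40) $I_{k+1} = \{i \in I_k: \hat{\beta}_i^k > 0\}$.

Finally, let $\beta^{k+1}$ defined below be a subvector of $\hat{\beta}^k$,

(4.41) $\beta^{k+1} = (\beta_i^{k+1}, i \in I_{k+1})$

where, $\beta_i^{k+1} = \hat{\beta}_i^k$, $i \in I_{k+1}$. Now suppose that $\bar{\beta}^{k+1}$ solves $P(\beta^{k+1}, I_{k+1})$.

(a) If $\hat{\beta}^k \geq 0$, then $\hat{\beta}^k$ solves $P(\beta^k, I_k)$.

(b) If $\hat{\beta}^k \ngeq 0$, then $\bar{\beta}$ solves $P(\beta^k, I_k)$, where $\bar{\beta}$ has components given by

(4.42) $\bar{\beta}_i = \bar{\beta}_i^{k+1}$ if $i \in I_{k+1}$, and $\bar{\beta}_i = 0$ otherwise, for each $i \in I_k$.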

PROOF: For the sake of convenience, let $RP(\alpha, I)$ denote the problem obtained by relaxing the nonnegativity restrictions in $P(\alpha, I)$. That is, let

$RP(\alpha, I)$: minimize $\{\frac{1}{2} \sum_{i \in I} (\lambda_i - \alpha_i)^2: \sum_{i \in I} \lambda_i = 1\}$.

First of all, note from Equations (4.38), (4.39) that $\hat{\beta}^k$ solves $RP(\beta^k, I_k)$ since $\hat{\beta}^k$ is the projection of $\beta^k$ onto the linear manifold

(4.43) $L = \{(\lambda_i, i \in I_k): \sum_{i \in I_k} \lambda_i = 1\}$

which is the feasible region of $RP(\beta^k, I_k)$. Thus, $\hat{\beta}^k \geq 0$ implies that $\hat{\beta}^k$ also solves $P(\beta^k, I_k)$. This proves part (a).

Next, suppose that $\hat{\beta}^k \ngeq 0$. Observe that $\bar{\beta}$ is feasible to $P(\beta^k, I_k)$ since from (4.42), we get $\bar{\beta} \geq 0$ and $\sum_{i \in I_k} \bar{\beta}_i = \sum_{i \in I_{k+1}} \bar{\beta}_i^{k+1} = 1$ as $\bar{\beta}^{k+1}$ solves $P(\beta^{k+1}, I_{k+1})$.

Now, consider any $\lambda = (\lambda_i, i \in I_k)$ feasible to $P(\beta^k, I_k)$. Then, by the Pythagorean Theorem, since $\hat{\beta}^k$ is the projection of $\beta^k$ onto (4.43), we get

$\|\lambda - \beta^k\|^2 = \|\lambda - \hat{\beta}^k\|^2 + \|\hat{\beta}^k - \beta^k\|^2$.

Hence, the optimal solution to $P(\hat{\beta}^k, I_k)$ is also optimal to $P(\beta^k, I_k)$. Now, suppose that we can show that the optimal solution to Problem $P(\hat{\beta}^k, I_k)$ must satisfy

(4.44) $\lambda_i = 0$ for $i \notin I_{k+1}$.

Then, noting (4.41), (4.42), and using the hypothesis that $\bar{\beta}^{k+1}$ solves $P(\beta^{k+1}, I_{k+1})$, we will have established part (b). Hence, let us prove that (4.44) must hold. Towards this end, consider the following Kuhn-Tucker equations for Problem $P(\hat{\beta}^k, I_k)$ with $t$ and $w_i$, $i \in I_k$ as the appropriate lagrangian multipliers:

(4.46) $\sum_{i \in I_k} \lambda_i = 1$, $\lambda_i \geq 0$ for each $i \in I_k$

(4.47) $(\lambda_i - \hat{\beta}_i^k) + t - w_i = 0$ and $w_i \geq 0$ for each $i \in I_k$

(4.48) $\lambda_i w_i = 0$ for each $i \in I_k$.

Now, since $\sum_{i \in I_k} \hat{\beta}_i^k = 1$, we get from (4.45), (4.46) that

$t = \sum_{i \in I_k} w_i / |I_k| \geq 0$.

But from (4.46), (4.47), and (4.48) we get for each $i \in I_k$,

$0 = w_i \lambda_i = \lambda_i (\lambda_i + t - \hat{\beta}_i^k)$

which implies that for each $i \in I_k$, we must have,

either $\lambda_i = 0$, whence from (4.46), $w_i = t - \hat{\beta}_i^k$ must be nonnegative

or $\lambda_i = \hat{\beta}_i^k - t$, whence from (4.46), $w_i = 0$.

In either case above, noting (4.45), if $\hat{\beta}_i^k \leq 0$, that is, if $i \notin I_{k+1}$, we must have $\lambda_i = 0$. This completes the proof.

Using Theorem 4, one may easily validate the following procedure for finding $\lambda_{new}$ of Equation (4.33), given $\hat{\lambda}^h$. This procedure has to be repeated separately for each $h \in H$.

Initialization

Set $k = 0$, $\beta^0 = \hat{\lambda}^h$, $I_0 = Q_h$. Go to Step 1.

Step 1

Given $\beta^k$, $I_k$, determine $\rho_k$ and $\hat{\beta}^k$ from (4.38), (4.39). If $\hat{\beta}^k \geq 0$, then terminate with $\lambda_{new}^h$ having components given by

$(\lambda_{new}^h)_i = \hat{\beta}_i^k$ if $i \in I_k$, and $(\lambda_{new}^h)_i = 0$ otherwise.

Otherwise, proceed to Step 2.

Step 2

Define $I_{k+1}$, $\beta^{k+1}$ as in Equations (4.40), (4.41), increment $k$ by one and return to Step 1.

Note that this procedure is finitely convergent as it results in a strictly decreasing, finite sequence $|I_k|$ satisfying $|I_k| \geq 1$ for each $k$, since $\sum_{i \in I_k} \hat{\beta}_i^k = 1$ for each $k$.

EXAMPLE: Suppose we want to project $\hat{\lambda}^h = (-2, 3, 1, 2)$ onto $\Lambda \subset R^4$. Then the above procedure yields the following results.

Initialization

$k = 0$, $\beta^0 = (-2, 3, 1, 2)$, $I_0 = \{1, 2, 3, 4\}$.

Step 1

$\rho_0 = -3/4$, $\hat{\beta}^0 = (-11/4, 9/4, 1/4, 5/4)$.

Step 2

$k = 1$, $I_1 = \{2, 3, 4\}$, $\beta^1 = (9/4, 1/4, 5/4)$.

Step 1

$\rho_1 = -11/12$, $\hat{\beta}^1 = (4/3, -2/3, 1/3)$.

Step 2

$k = 2$, $I_2 = \{2, 4\}$, $\beta^2 = (4/3, 1/3)$.

Step 1

$\rho_2 = -1/3$, $\hat{\beta}^2 = (1, 0) \geq 0$.

Thus, $\lambda_{new}^h = (0, 1, 0, 0)$.
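
The projection procedure of this subsection is short enough to sketch directly in Python. The function name `project_onto_simplex` is hypothetical, and the shift $\rho_k$ is taken to be the uniform amount that makes the current components sum to one, which is how I read (4.38) and which agrees with the value $\rho_0 = -3/4$ in the example. Inputs are converted to exact fractions so the example can be checked without rounding.

```python
from fractions import Fraction as F

def project_onto_simplex(lam_hat):
    """Project lam_hat onto {lam: sum(lam) = 1, lam >= 0} by repeated shift-and-drop."""
    beta = {i: F(v) for i, v in enumerate(lam_hat)}          # beta^0 on I_0
    while True:
        rho = (1 - sum(beta.values())) / len(beta)           # uniform shift, cf. (4.38)
        beta_hat = {i: b + rho for i, b in beta.items()}     # (4.39)
        if all(b >= 0 for b in beta_hat.values()):
            # Step 1 termination: dropped indices receive zero.
            return [beta_hat.get(i, F(0)) for i in range(len(lam_hat))]
        # Step 2: keep only the strictly positive components, cf. (4.40)-(4.41).
        beta = {i: b for i, b in beta_hat.items() if b > 0}

projected = project_onto_simplex([-2, 3, 1, 2])
print(projected)
```

Each pass either terminates or removes at least one index, and the shifted components always sum to one, so the retained index set never becomes empty.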


4.1.4 A Second Sufficient Condition for Termination

As indicated earlier in Section 4.1.2, we will now derive a second sufficient condition on $\bar{w}$ for $\bar{\lambda}$ to solve PD2. For this purpose, consider the following lemma:

LEMMA 3: Let $\bar{\lambda} \in \Lambda$ be given and suppose we obtain $\bar{w}$ using Equations (4.22) through (4.27). Let $\hat{w}$ solve the problem

$PR_h$: minimize $\{\frac{1}{2} \sum_{i \in Q_h} (\hat{w}_i^h - \bar{w}_i^h)^2: \sum_{i \in Q_h} \hat{w}_i^h = 0,\ \hat{w}_i^h \leq 0$ for $i \in J_h\}$ for each $h \in H$

where,

(4.49) $J_h = \{i \in Q_h: \bar{\lambda}_i^h = 0\}$, $h \in H$.

Then, if $\hat{w} = 0$, $\bar{\lambda}$ solves Problem PD2.

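PROOF. Since $\hat{w} = 0$ solves $PR_h$, $h \in H$, we have for each $h \in H$,

(4.50) $\sum_{i \in Q_h} (\bar{w}_i^h)^2 \leq \sum_{i \in Q_h} (\hat{w}_i^h - \bar{w}_i^h)^2$

for all $\hat{w}_i^h$, $i \in Q_h$ satisfying $\sum_{i \in Q_h} \hat{w}_i^h = 0$, $\hat{w}_i^h \leq 0$ for $i \in J_h$. Given any $\lambda \in \Lambda$ and given any $\mu > 0$ define,

(4.51) $\hat{w}_i^h = (\bar{\lambda}_i^h - \lambda_i^h)/\mu$, $i \in Q_h$, $h \in H$.

Then, $\sum_{i \in Q_h} \hat{w}_i^h = 0$ for each $h \in H$ and since $\bar{\lambda}_i^h = 0$ for $i \in J_h$, $h \in H$, we get $\hat{w}_i^h \leq 0$ for $i \in J_h$, $h \in H$. Thus, for any $\lambda \in \Lambda$, by substituting (4.51) into (4.50), we have,

(4.52) $\mu^2 \sum_{i \in Q_h} (\bar{w}_i^h)^2 \leq \sum_{i \in Q_h} (\lambda_i^h - \bar{\lambda}_i^h + \mu \bar{w}_i^h)^2$ for each $h \in H$.

But Equation (4.52) implies that for each $h \in H$, $\lambda^h = \bar{\lambda}^h$ solves the problem

minimize $\{\sum_{i \in Q_h} [\lambda_i^h - (\bar{\lambda}_i^h - \mu \bar{w}_i^h)]^2: \sum_{i \in Q_h} \lambda_i^h = 1,\ \lambda_i^h \geq 0,\ i \in Q_h\}$ for each $h \in H$.

In other words, the projection $P_\Lambda(\bar{\lambda} - \bar{w}\mu)$ of $(\bar{\lambda} - \bar{w}\mu)$ onto $\Lambda$ is equal to $\bar{\lambda}$ for any $\mu \geq 0$. In view of Poljak's result [18,19], since $\bar{w}$ is a subgradient of $f(\cdot)$ at $\bar{\lambda}$, then $\bar{\lambda}$ solves PD2. This completes the proof.

Note that Lemma 3 above states that if the "closest" feasible direction $-\hat{w}$ to $-\bar{w}$ is a zero vector, then $\bar{\lambda}$ solves PD2. Based on this result, we derive through Lemma 4 below a second sufficient condition for $\bar{\lambda}$ to solve PD2.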

LEMMA 4: Suppose $\hat{w} = 0$ solves Problems $PR_h$, $h \in H$ as in Lemma 3. Then for each $h \in H$, we must have

(4.53) (a) $\bar{w}_i^h = t_h$, a constant, for each $i \notin J_h$

(b) $\bar{w}_i^h \leq t_h$ for each $i \in J_h$

where $J_h$ is given by Equation (4.49).

PROOF: Let us write the Kuhn-Tucker conditions for Problem $PR_h$, for any $h \in H$. We obtain

$(\hat{w}_i^h - \bar{w}_i^h) + t_h = 0$ for $i \notin J_h$

$(\hat{w}_i^h - \bar{w}_i^h) + t_h - u_i^h = 0$ for $i \in J_h$

$u_i^h \geq 0$, $i \in J_h$, $u_i^h \hat{w}_i^h = 0$, $i \in J_h$, $t_h$ unrestricted

$\sum_{i \in Q_h} \hat{w}_i^h = 0$, $\hat{w}_i^h \leq 0$ for $i \in J_h$.

If $\hat{w} = 0$ solves $PR_h$, $h \in H$, then since $PR_h$ has a convex objective function and linear constraints, there must exist a solution to

$\bar{w}_i^h = t_h$ for each $i \notin J_h$

and

$u_i^h = (t_h - \bar{w}_i^h) \geq 0$ for each $i \in J_h$.

This completes the proof.

Thus Equation (4.53) gives us another sufficient condition for $\bar{\lambda}$ to solve PD2. We illustrate the use of this condition through an example in Section 4.1.7.

4.1.5 Schema of an Algorithm to Solve Problem PD2

The procedure is depicted schematically below. In block 1, an arbitrary or preferably, a good heuristic solution $\bar{\lambda} \in \Lambda$ is sought. For example, one may use $\lambda_i^h = 1/|Q_h|$ for each $i \in Q_h$, for $h \in H$. For blocks 4 and 6, we recommend the procedural steps proposed by Held, Wolfe and Crowder [12] for the subgradient optimization scheme.
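
Since the schematic itself is not reproduced in this copy, the overall loop can be sketched as follows. This is a rough illustration under several assumptions, not the authors' algorithm: all function names are hypothetical; the step size is a plain diminishing rule (step length 1/k along the normalized subgradient) rather than the rule of Held, Wolfe and Crowder [12]; each $j$ with $y_j > 0$ is simply credited to the first maximizing $h$; the only stopping tests are a zero subgradient and an iteration limit; and the data are the constraint sets of the example of Section 4.1.7 as I read them.

```python
# Rows a^h_{ij} of S_1 and S_2 (read from the example of Section 4.1.7; an assumption).
A = {
    1: [[2, -3, 1], [-1, 2, 3], [3, -1, -1]],
    2: [[3, -1, -1], [2, 1, -2], [-1, 3, 3]],
}

def value_and_subgradient(A, lam):
    """Return f(lam) of (4.29) and one subgradient w built through (4.22)-(4.27)."""
    hs = sorted(A)
    n = len(A[hs[0]][0])
    coef = {h: [sum(l * r[j] for l, r in zip(lam[h], A[h])) for j in range(n)] for h in hs}
    y = [max(0.0, max(coef[h][j] for h in hs)) for j in range(n)]
    u = {h: [0.0] * n for h in hs}
    for j in range(n):
        if y[j] > 0.0:
            h_act = max(hs, key=lambda h: coef[h][j])    # first maximizing index, lies in H_j
            u[h_act][j] = 2.0 * y[j]                     # one solution of (4.24)
    w = {h: [sum(u[h][j] * (row[j] - y[j]) for j in range(n)) for row in A[h]] for h in hs}
    return sum(v * v for v in y), w

def project(v):
    """Projection onto the unit simplex by the shift-and-drop scheme of Section 4.1.3."""
    idx, beta = list(range(len(v))), list(v)
    while True:
        rho = (1.0 - sum(beta)) / len(beta)
        beta = [b + rho for b in beta]
        if min(beta) >= 0.0:
            out = [0.0] * len(v)
            for i, b in zip(idx, beta):
                out[i] = b
            return out
        keep = [k for k, b in enumerate(beta) if b > 0.0]
        idx, beta = [idx[k] for k in keep], [beta[k] for k in keep]

def solve_pd2(A, iters=200):
    lam = {h: [1.0 / len(A[h])] * len(A[h]) for h in A}      # block 1: uniform start
    best_f, best_lam = float("inf"), lam
    for k in range(1, iters + 1):
        f, w = value_and_subgradient(A, lam)
        if f < best_f:
            best_f, best_lam = f, lam
        norm = sum(v * v for ws in w.values() for v in ws) ** 0.5
        if norm == 0.0:
            break
        step = 1.0 / (k * norm)                              # ASSUMED diminishing step rule
        lam = {h: project([l - step * g for l, g in zip(lam[h], w[h])]) for h in A}
    return best_f, best_lam

f0, _ = value_and_subgradient(A, {h: [1.0 / 3] * 3 for h in A})
best_f, best_lam = solve_pd2(A)
print(f0, best_f, best_lam)
```

Because a subgradient step need not decrease the objective, the sketch keeps the best point seen so far rather than the last iterate.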
4.1.6 Derivation of a Good Subgradient Direction

In our discussion in Section 4.1.1, we saw that given a $\bar{\lambda} \in \Lambda$ of Equation (4.28), we were able to uniquely determine $\bar{y}_j$, $j = 1, \ldots, n$ through Equation (4.22). Thereafter, once we fixed values $\bar{u}_j^h$ for $u_j^h$, $j = 1, \ldots, n$, $h \in H$ satisfying Equation (4.24), we were able to uniquely determine values for the other variables in the Kuhn-Tucker System using Equations (4.26), (4.27). Moreover, the only choice in determining $\bar{u}_j^h$, $j = 1, \ldots, n$, $h \in H$ arose in case $|H_j| \geq 2$ for some $j \in \{1, \ldots, n\}$ in Equation (4.25). We also established that no matter what feasible values we selected for $u_j^h$, $j \in \{1, \ldots, n\}$, $h \in H$, the corresponding vector $\bar{w}$ obtained was a subgradient direction. In order to select the best such subgradient direction, we are interested in finding a vector $\bar{w}$ which has the smallest euclidean norm among all possible vectors corresponding to the given solution $\bar{\lambda} \in \Lambda$. However, this problem is not easy to solve. Moreover, since this step will merely be a subroutine at each iteration of the proposed scheme to solve PD2, we will present a heuristic approach to this problem.

Towards this end, let us define for convenience, mutually exclusive but not uniquely determined sets $N_h$, $h \in H$ as follows:

(4.54) $N_h \subseteq \{j \in \{1, \ldots, n\}: h \in H_j$ of Equation (4.23)$\}$

(4.55) $N_h \cap N_k = \{\emptyset\}$ for any distinct $h, k \in H$ and $\bigcup_{h \in H} N_h = \{j \in \{1, \ldots, n\}: \bar{y}_j > 0\}$.

In other words, we take each $j \in \{1, \ldots, n\}$ which has $\bar{y}_j > 0$, and assign it to some $h \in H_j$, that is, assign it to a set $N_h$, where $h \in H_j$. Having done this, we let

(4.56) $\bar{u}_j^h = 2\bar{y}_j$ if $j \in N_h$, and $\bar{u}_j^h = 0$ otherwise, for each $j \in \{1, \ldots, n\}$, $h \in H$.

Note that Equation (4.56) yields values $\bar{u}_j^h$ for $u_j^h$, $j \in \{1, \ldots, n\}$, $h \in H$ which are feasible to (4.24). Hence, having defined sets $N_h$, $h \in H$ as in Equations (4.54), (4.55), we determine $\bar{u}_j^h$, $j \in \{1, \ldots, n\}$, $h \in H$ through (4.56) and hence $\bar{w}$ through (4.27).

Thus, the proposed heuristic scheme commences with a vector $\bar{w}$ obtained through an arbitrary selection of sets $N_h$, $h \in H$ satisfying Equations (4.54), (4.55). Thereafter, we attempt to improve (decrease) the value of $\bar{w}'\bar{w}$ in the following manner. We consider in turn each $j \in \{1, \ldots, n\}$ which satisfies $|H_j| \geq 2$ and move it from its current set $N_{h_j}$, say, to another set $N_h$ with $h \in H_j$, $h \neq h_j$, if this results in a decrease in $\bar{w}'\bar{w}$. If no such single movements result in a decrease in $\bar{w}'\bar{w}$, we terminate with the incumbent solution $\bar{w}$ as the sought subgradient direction. This procedure is illustrated in the example given below.
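
A minimal sketch of this single-move heuristic is given below. The names `w_for`, `norm2` and `improve` are hypothetical, the assignment of each $j$ to a set $N_h$ is stored as a dictionary from the 0-based index of $j$ to $h$, and the data are the constraint sets and the point $\bar{\lambda}$ of the illustrative example of Section 4.1.7 as I read them from this copy. Exact fractions are used so that ties in $\bar{w}'\bar{w}$ are detected reliably.

```python
from fractions import Fraction as F

# Data read from the example of Section 4.1.7 (an assumption of this sketch).
A = {
    1: [[2, -3, 1], [-1, 2, 3], [3, -1, -1]],
    2: [[3, -1, -1], [2, 1, -2], [-1, 3, 3]],
}
lam = {1: [F(0), F(5, 12), F(7, 12)], 2: [F(7, 12), F(0), F(5, 12)]}

def w_for(A, y, assign):
    """w of (4.27) for the multipliers (4.56) induced by assign: j -> h."""
    n = len(y)
    u = {h: [2 * y[j] if assign.get(j) == h else 0 for j in range(n)] for h in A}
    return {h: [sum(u[h][j] * (row[j] - y[j]) for j in range(n)) for row in A[h]] for h in A}

def norm2(w):
    return sum(v * v for ws in w.values() for v in ws)

def improve(A, lam, assign):
    """Move single indices j between sets N_h while w'w strictly decreases."""
    hs = sorted(A)
    n = len(A[hs[0]][0])
    coef = {h: [sum(l * r[j] for l, r in zip(lam[h], A[h])) for j in range(n)] for h in hs}
    y = [max(0, *(coef[h][j] for h in hs)) for j in range(n)]
    Hj = [[h for h in hs if y[j] > 0 and coef[h][j] == y[j]] for j in range(n)]
    assign, best = dict(assign), norm2(w_for(A, y, assign))
    improved = True
    while improved:
        improved = False
        for j in range(n):
            if len(Hj[j]) < 2:
                continue                      # no choice for this j
            for h in Hj[j]:
                if h == assign[j]:
                    continue
                trial = {**assign, j: h}
                val = norm2(w_for(A, y, trial))
                if val < best:                # accept strictly improving moves only
                    assign, best, improved = trial, val, True
    return assign, best

final_assign, final_norm2 = improve(A, lam, {0: 1, 1: 2, 2: 2})
stuck_assign, stuck_norm2 = improve(A, lam, {0: 1, 1: 2, 2: 1})
print(final_assign, final_norm2, stuck_assign, stuck_norm2)
```

Starting from $N_1 = \{1\}$, $N_2 = \{2, 3\}$ the sketch moves $j = 1$ into $N_2$ and stops, while the start $N_1 = \{1, 3\}$, $N_2 = \{2\}$ admits no strictly improving single move, which is the kind of local optimum the heuristic can stall at.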

4.1.7 Illustrative Example

The intention of this subsection is to illustrate the scheme of the foregoing section for determining a good subgradient direction as well as the termination criterion of Section 4.1.4.

Thus, let $H = \{1, 2\}$, $n = 3$, $|Q_1| = |Q_2| = 3$ and consider the constraint sets

$S_1 = \{x: 2x_1 - 3x_2 + x_3 \geq 1,\ -x_1 + 2x_2 + 3x_3 \geq 1,\ 3x_1 - x_2 - x_3 \geq 1,\ x_1, x_2, x_3 \geq 0\}$

and

$S_2 = \{x: 3x_1 - x_2 - x_3 \geq 1,\ 2x_1 + x_2 - 2x_3 \geq 1,\ -x_1 + 3x_2 + 3x_3 \geq 1,\ x_1, x_2, x_3 \geq 0\}$.

Further, suppose we are currently located at a point $\bar{\lambda}$ with

$\bar{\lambda}_1^1 = 0$, $\bar{\lambda}_2^1 = 5/12$, $\bar{\lambda}_3^1 = 7/12$; $\bar{\lambda}_1^2 = 7/12$, $\bar{\lambda}_2^2 = 0$, $\bar{\lambda}_3^2 = 5/12$.

Then the associated surrogate constraints are

(4.57) $\frac{4}{3} x_1 + \frac{1}{4} x_2 + \frac{2}{3} x_3 \geq 1$ for $h = 1$, and $\frac{4}{3} x_1 + \frac{2}{3} x_2 + \frac{2}{3} x_3 \geq 1$ for $h = 2$.

Using Equations (4.22), (4.25), we find

$\bar{y}_1 = 4/3$ with $H_1 = \{1, 2\}$, $\bar{y}_2 = 2/3$ with $H_2 = \{2\}$ and $\bar{y}_3 = 2/3$ with $H_3 = \{1, 2\}$.

Note that the possible combinations of $N_1$ and $N_2$ are as follows:

(i) $N_1 = \{1\}$, $N_2 = \{2, 3\}$,

(ii) $N_1 = \{\emptyset\}$, $N_2 = \{1, 2, 3\}$,

(iii) $N_1 = \{1, 3\}$, $N_2 = \{2\}$, and

(iv) $N_1 = \{3\}$, $N_2 = \{1, 2\}$.

A total enumeration of the values of $\bar{u}$ obtained for these sets through (4.56) and the corresponding values for $\bar{w}$ are shown below. Each row lists $N_1$; $N_2$; $(\bar{u}_1^1, \bar{u}_2^1, \bar{u}_3^1)$; $(\bar{u}_1^2, \bar{u}_2^2, \bar{u}_3^2)$; $(\bar{w}_1^1, \bar{w}_2^1, \bar{w}_3^1)$; $(\bar{w}_1^2, \bar{w}_2^2, \bar{w}_3^2)$; $\bar{w}'\bar{w}$.

(i)   {1};    {2,3};    (8/3, 0, 0);    (0, 4/3, 4/3);     (16/9, -56/9, 40/9);   (-40/9, -28/9, 56/9);   129.78

(ii)  {∅};    {1,2,3};  (0, 0, 0);      (8/3, 4/3, 4/3);   (0, 0, 0);             (0, -4/3, 0);           1.78

(iii) {1,3};  {2};      (8/3, 0, 4/3);  (0, 4/3, 0);       (20/9, -28/9, 20/9);   (-20/9, 4/9, 28/9);     34.37

(iv)  {3};    {1,2};    (0, 0, 4/3);    (8/3, 4/3, 0);     (4/9, 28/9, -20/9);    (20/9, 20/9, -28/9);    34.37

Thus, according to the proposed scheme, if we commence with $N_1 = \{1\}$, $N_2 = \{2, 3\}$, then picking $j = 1$ which has $|H_j| = 2$, we can move $j = 1$ into $N_2$ since $2 \in H_1$. This leads to an improvement. As one can see from above, no further improvement is possible. In fact, the best solution shown above is accessible by the proposed scheme by all except the third case which is a "local optimal".


We now illustrate the sufficient termination condition of Section 4.1.4. The vector $w$ obtained above is $(0, 0, 0 \mid 0, -4/3, 0)$, where the first three components correspond to $h = 1$ and the last three to $h = 2$. Further, the vector $\lambda$ is $(0, 5/12, 7/12 \mid 7/12, 0, 5/12)$. Thus, even though $w \neq 0$, we see that the conditions (4.53) of Lemma 6 are satisfied for each $h \in H = \{1, 2\}$ and thus the given $\lambda$ solves PD$_2$.


The disjunctive cut (3.8) derived with this optimal solution $\lambda$ is obtained through (4.57)
as

(4.58)  y*|  +  yX2  +  yX3  ^  1. 

It is interesting to compare this cut with that obtained through the parameter values $\lambda_i^h = 1/|Q_h|$ for each $i \in Q_h$ as recommended by Balas [1,2]. This latter cut is

(4.59)  yXi  +  x2  +  x3  >  1. 

Observe  that  (4.58)  uniformly  dominates  (4.59). 

4.2  Maximizing  the  Rectilinear  Distance  Between  the  Origin  and  the  Disjunctive  Cut 

In  this  section,  we  will  briefly  consider  the  case  where  one  desires  to  use  rectilinear 
instead  of  euclidean  distances.  Extending  the  developments  of  Sections  2,  3  and  4.1,  one  may 
easily  see  that  the  relevant  problem  is 

minimize  (maximum  y,:  constraints  (4.12),  (4.13),  (4.14)  are  satisfied). 

The reason why we consider this formulation is its intuitive appeal. To see this, note that the above problem is separable in $h \in H$ and may be rewritten as

PD$_1$: minimize $\{\xi_h : \xi_h \geq \sum_{i \in Q_h} \lambda_i^h a_{ij}^h$ for each $j = 1, \ldots, n$, $\sum_{i \in Q_h} \lambda_i^h = 1$, $\lambda_i^h \geq 0$ for $i \in Q_h$, $\xi_h \geq 0\}$ for each $h \in H$.

Thus, for each $h \in H$, PD$_1$ seeks $\lambda_i^h$, $i \in Q_h$, such that the largest of the surrogate constraint coefficients is minimized. Once such surrogate constraints are obtained, the disjunctive cut (3.8) is derived using the principles of Section 3.

As far as the solution of Problem PD$_1$ is concerned, we merely remark that one may
either  solve  it  as  a  linear  program  or  rewrite  it  as  the  minimization  of  a  piecewise  linear  convex 
function  subject  to  linear  constraints  and  use  a  subgradient  optimization  technique. 
ABSTRACT

The reliability of a serial production line is optimized with respect to the location of a single buffer. The problem was earlier defined and solved by Soyster and Toof for the special case of an even number of machines all having equal probability of failure. In this paper we generalize the results for any number of machines and remove the restriction of identical machine reliabilities. In addition, an analysis of multibuffer systems is presented with a closed form solution for the reliability when both the number of buffers and their capacity are limited. For the general multibuffer system we present an approach for determining system reliability.


1.  INTRODUCTION 

Several  types  of  production  line  models  appear  in  the  literature.  Each  one  is  a  realization 
of  a  different  real  life  situation.  A  summary  of  the  various  types  and  the  differences  in  the 
mechanism of product flow among them appears in Buzacott [5], Koenigsberg [9], Toof [14] or
Buxey et al. [1]. Recently Soyster and Toof [13] defined a serial production line, which is the
model  analyzed  in  this  paper. 

The mechanism of product flow in a serial production line is described via Figure 1. An unlimited source of raw material exists before machine 1. If machine 1 is capable of working (i.e., not failed), an operator takes a unit of raw material and processes it on machine 1, after which he moves to machine 2 and processes it on machine 2, if machine 2 is capable of working. He proceeds analogously until machine $N$, where a finished product is completed. Let $T_i$ be the process time on machine $i$. Then the cycle time of the system is $T = \sum_{i=1}^{N} T_i$. Let $q_i$ be the probability that at any cycle $T$ machine $i$ is capable of working and $p_i = 1 - q_i$ the probability of failing. The serial production line with no buffer must stop working if any of the individual machines on the line fails. The placement of a single buffer of capacity $M$ after machine $i$ alleviates this situation. If any of the first $i$ machines fail and the buffer is not empty, machines


•This  study  was  done  when  the  author  was  at  the  Department  of  Energy.  Washington,  D.C.  under  the  provisions  of  the 
Figure 1. Serial production line with $N$ machines and a single buffer


$i + 1, i + 2, \ldots, N$ can still function. Conversely, if any of the machines $i + 1, \ldots, N$ fail and the buffer is not full, the first $i$ machines may still work and produce a semifinished good to be stored in the buffer. One obviously would like to identify the optimal placement of this buffer. Soyster and Toof [13] proved that if there are an even number of machines, all identically reliable ($q_i = q\ \forall i$), then the optimal placement of the buffer is exactly in the middle of the line. In section 2 we generalize these results for any number of machines, not necessarily identically reliable. Specifically, we prove that the optimal placement of a single buffer is at a place which minimizes the absolute value of the difference between the reliabilities of the two parts of the line separated by the buffer.


The optimal location $i^*$ is determined from (1):

(1) $\left| \prod_{j=1}^{i^*} q_j - \prod_{j=i^*+1}^{N} q_j \right| = \min_i \left| \prod_{j=1}^{i} q_j - \prod_{j=i+1}^{N} q_j \right|$.


A  more  difficult  question  is  the  optimal  locations  of  several  buffers.  In  section  3  we  analyze  a 
special  case  of  a  two  buffer  system,  each  buffer  having  a  capacity  of  one  unit.  In  section  4  we 
present  an  approach  that  can  be  used  for  any  number  of  buffers  with  any  capacity.  The 
approach  we  suggest  is  efficient  as  long  as  the  number  of  buffers  and  their  capacity  remains 
relatively  small. 


2.  OPTIMAL  LOCATION  OF  A  SINGLE  BUFFER 


Let a single buffer with capacity $M$ be placed after machine $i$. Let $\alpha_i = \prod_{j=1}^{i} q_j$, $\beta_i = \prod_{j=i+1}^{N} q_j$, $\rho_i = (\alpha_i - \alpha_i\beta_i)/(\beta_i - \alpha_i\beta_i)$, and let $X_n$ be the number of units in the buffer at the beginning of cycle $n$. Soyster and Toof [13] have shown that $X_n$ defines a finite Markov chain, presented its transition matrix and found that the reliability $R(i)$ of the line is given by (2) and (3):

(2) $R(i) = \alpha_i\beta_i + \beta_i(1 - \alpha_i)\,\dfrac{\rho_i - \rho_i^{M+1}}{1 - \rho_i^{M+1}}$ if $\alpha_i \neq \beta_i$,

(3) $R(i) = \alpha_i\beta_i + \beta_i(1 - \alpha_i)\,\dfrac{M}{M+1}$ if $\alpha_i = \beta_i$.
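
To make (2) and (3) concrete, the following minimal sketch evaluates the line reliability for a given buffer position. The function name, the 1-based position argument, and the numerical tolerance used to detect the equal-reliability case are our own assumptions; machine reliabilities are assumed to lie strictly between 0 and 1 and the buffer is assumed to sit strictly inside the line.

```python
from math import prod


def line_reliability(q, i, M):
    """Reliability of the line per (2)-(3): buffer of capacity M after machine i (1-based)."""
    alpha = prod(q[:i])          # reliability of machines 1..i
    beta = prod(q[i:])           # reliability of machines i+1..N
    base = alpha * beta          # reliability of the line with no buffer
    gain = beta * (1.0 - alpha)  # multiplier of the buffer term
    if abs(alpha - beta) < 1e-12:            # case (3): the two segments are equally reliable
        return base + gain * M / (M + 1)
    rho = (alpha - base) / (beta - base)     # rho_i as defined in the text
    return base + gain * (rho - rho ** (M + 1)) / (1.0 - rho ** (M + 1))


q = [0.9, 0.8, 0.95, 0.85]  # hypothetical machine reliabilities
for M in (0, 1, 5, 200):
    print(M, line_reliability(q, 2, M))
```

With $M = 0$ the buffer term vanishes and the value is the unbuffered reliability $\prod_j q_j$. In this example $\alpha_2 < \beta_2$, so for large $M$ the value approaches $\alpha_2$, the reliability of the weaker segment.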

One has to maximize $R(i)$ with respect to $i$, that is, to identify the optimal location of the buffer within the line. Since $\alpha_i\beta_i = \prod_{j=1}^{i} q_j \prod_{j=i+1}^{N} q_j = \prod_{j=1}^{N} q_j$ is a constant and does not affect the location of the buffer, one can simply ignore this term from (2) and (3) in the optimization phase. Thus, we want to find $i^*$ that maximizes $R(i)$, or:

(4) $R(i^*) = \max_i R(i) = \max_i \begin{cases} \beta_i(1 - \alpha_i)\,\dfrac{\rho_i - \rho_i^{M+1}}{1 - \rho_i^{M+1}} & \text{if } \alpha_i \neq \beta_i, \\ \beta_i(1 - \alpha_i)\,\dfrac{M}{M+1} & \text{if } \alpha_i = \beta_i. \end{cases}$

The approach we take to solve (4) for $i^*$ is to show that $R(i)$ is strictly increasing with $\alpha_i$ for $\alpha_i < \beta_i$ and strictly decreasing with $\alpha_i$ for $\alpha_i > \beta_i$; that $\alpha_i = \beta_i$ occurs when $R(i)$ reaches its maximum value; and that $R(i)$ is symmetric about the point $i^*$ where $\alpha_{i^*} = \beta_{i^*}$.

Let

(5) $R(i) = (\beta_i - \alpha_i\beta_i)\,\dfrac{\rho_i^{M+1} - \rho_i}{\rho_i^{M+1} - 1}$, $\alpha_i \neq \beta_i$.

When $\alpha_i = \beta_i$, $\rho_i = 1$ and (5) becomes (6):

(6) $R(i) = (\beta_i - \alpha_i\beta_i)\,\dfrac{M}{M+1}$.

Note in (6) that as $M$ becomes large the total reliability of the line, which is equal to $\alpha_i\beta_i + R(i)$, approaches $\beta_i$. That is, the two segments of the line become independent of each other.

In this section the general strategy is to show that if $\alpha_i > \beta_i$ or $\alpha_i < \beta_i$ then the reliability of (5) is smaller than the reliability of (6). Hence, we treat $\alpha_i$ as a continuous variable and show that the derivative of (5) with respect to $\alpha_i$ is positive for $\alpha_i < \beta_i$ and negative for $\alpha_i > \beta_i$.


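The derivative of $R(i)$ with respect to $\alpha_i$ is:

$\dfrac{dR(i)}{d\alpha_i} = \dfrac{d\beta_i}{d\alpha_i}\,\dfrac{\rho_i^{M+1} - \rho_i}{\rho_i^{M+1} - 1} + (\beta_i - \alpha_i\beta_i)\,\dfrac{d\rho_i}{d\alpha_i}\,\dfrac{M\rho_i^{M+1} - (M+1)\rho_i^{M} + 1}{(\rho_i^{M+1} - 1)^2}$,

where $\alpha_i\beta_i$ is held constant, so that $d\beta_i/d\alpha_i = -\beta_i/\alpha_i$.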

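LEMMA 1: The additional reliability function $R(i)$ is strictly increasing with respect to $\alpha_i$ over the range $\left[\prod_{j=1}^{N} q_j,\ \left(\prod_{j=1}^{N} q_j\right)^{1/2}\right)$, and strictly decreasing with respect to $\alpha_i$ over the range $\left(\left(\prod_{j=1}^{N} q_j\right)^{1/2},\ 1\right]$. That is, if $0 < \alpha_i < \beta_i$, then $dR(i)/d\alpha_i > 0$. Conversely, if $\beta_i < \alpha_i \leq 1$, then $dR(i)/d\alpha_i < 0$. (Note that the first range is closed from the left and open from the right; the second range is open from the left and closed from the right.)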


THEOREM 1: The optimal placement, $i^*$, of a single buffer of integer capacity $M$ in an $N$ machine line is where $\alpha_{i^*} = \beta_{i^*}$.


PROOF: The proof of this theorem is essentially complete. We must only show that (5) is continuous at the point where $\alpha_{i^*} = \beta_{i^*}$. By definition the additional reliability attributable to the introduction of the buffer when $\alpha_{i^*} = \beta_{i^*}$ is:

$(\beta_i - \alpha_i\beta_i)\,\dfrac{M}{M+1}$.

As $\alpha_i \to \beta_i$, $\rho_i \to 1$, so that in (5) the limit of the steady state probability as $\alpha_i \to \beta_i$ is of the indeterminate form 0/0. However, an application of L'Hospital's rule shows that:

$\lim_{\alpha_i \to \beta_i} \dfrac{\rho_i^{M+1} - \rho_i}{\rho_i^{M+1} - 1} = \dfrac{M}{M+1}$,

and thus the continuity is proven.
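
The limit used in the proof can also be checked numerically. The sketch below evaluates the ratio from (5) in exact rational arithmetic at values of $\rho$ just below and just above 1; the function name and the chosen value $M = 3$ are illustrative assumptions only.

```python
from fractions import Fraction


def buffer_factor(rho, M):
    """The ratio (rho^(M+1) - rho) / (rho^(M+1) - 1) appearing in (5), for rho != 1."""
    return (rho ** (M + 1) - rho) / (rho ** (M + 1) - 1)


M = 3
for eps in (Fraction(1, 10**3), Fraction(1, 10**6)):
    below = buffer_factor(1 - eps, M)  # rho slightly less than 1
    above = buffer_factor(1 + eps, M)  # rho slightly greater than 1
    print(float(eps), float(below), float(above), M / (M + 1))
```

Both one-sided values move toward $M/(M+1)$ as the offset shrinks, in line with the continuity argument.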
Theorem 1 defines an optimal though not necessarily feasible solution to the problem of buffer placement. The condition $\alpha_i = \beta_i$ may be impossible to satisfy. In the remainder of this section we examine the symmetry of the reliability function defined by Equation (5), develop a simple criterion that provides the best feasible solution and, lastly, examine the special case of identical machine reliability, i.e., $q_i = q\ \forall i$.

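LEMMA 2: Given $K_1$ and $K_2$ continuous variables such that $\alpha_{K_1} - \beta_{K_1} = \beta_{K_2} - \alpha_{K_2}$. Then $\rho_{K_1} \cdot \rho_{K_2} = 1$.

PROOF: Recall that $\alpha_i\beta_i = \prod_{j=1}^{N} q_j = Q$, a constant for all $i$. Thus the condition $\alpha_{K_1} - \beta_{K_1} = \beta_{K_2} - \alpha_{K_2}$ may be rewritten $\alpha_{K_1} - Q/\alpha_{K_1} = Q/\alpha_{K_2} - \alpha_{K_2}$. This implies that:

$\alpha_{K_1} + \alpha_{K_2} = \dfrac{Q(\alpha_{K_1} + \alpha_{K_2})}{\alpha_{K_1}\alpha_{K_2}}$, or that $\alpha_{K_1}\alpha_{K_2} = Q$.

Similarly, one obtains the result that $\beta_{K_1}\beta_{K_2} = Q$. We want to show that $\rho_{K_1} \cdot \rho_{K_2} = 1$. Substituting for $\rho_{K_1}$ and $\rho_{K_2}$ in the definition of $\rho$ yields:

$\rho_{K_1}\rho_{K_2} = \dfrac{(\alpha_{K_1} - Q)(\alpha_{K_2} - Q)}{(\beta_{K_1} - Q)(\beta_{K_2} - Q)}$.

We then must show that:

$(\alpha_{K_1} - Q)(\alpha_{K_2} - Q) = (\beta_{K_1} - Q)(\beta_{K_2} - Q)$

or that

$\alpha_{K_1}\alpha_{K_2} - Q(\alpha_{K_1} + \alpha_{K_2}) = \beta_{K_1}\beta_{K_2} - Q(\beta_{K_1} + \beta_{K_2})$.

The condition $\alpha_{K_1} - \beta_{K_1} = \beta_{K_2} - \alpha_{K_2}$ implies both that $\alpha_{K_1} + \alpha_{K_2} = \beta_{K_1} + \beta_{K_2}$ and that $\alpha_{K_1}\alpha_{K_2} = \beta_{K_1}\beta_{K_2} = Q$, and thus the proof is complete.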
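
A small exact-arithmetic example may help in following Lemma 2. The values of $Q$ and $\alpha_{K_1}$ below are arbitrary assumptions chosen for illustration; the mirror point $K_2$ is constructed from the relation $\alpha_{K_1}\alpha_{K_2} = Q$ obtained in the proof.

```python
from fractions import Fraction


def rho(alpha, Q):
    """rho = (alpha - Q) / (beta - Q), with beta = Q / alpha because alpha * beta = Q."""
    beta = Q / alpha
    return (alpha - Q) / (beta - Q)


Q = Fraction(1, 2)          # assumed value of the constant product alpha_i * beta_i
alpha_1 = Fraction(4, 5)    # assumed alpha at the point K1
alpha_2 = Q / alpha_1       # mirror point K2, chosen so that alpha_K1 * alpha_K2 = Q

beta_1, beta_2 = Q / alpha_1, Q / alpha_2
hypothesis_holds = (alpha_1 - beta_1) == (beta_2 - alpha_2)  # hypothesis of Lemma 2
product = rho(alpha_1, Q) * rho(alpha_2, Q)
print(hypothesis_holds, rho(alpha_1, Q), rho(alpha_2, Q), product)
```

Here the two values of $\rho$ are reciprocals of one another, so their product is exactly 1.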

This  leads  directly  to  the  following  theorem: 

THEOREM 2: For a continuous argument ($i$), $R(i)$ is symmetric about the point $i^*$ where $\alpha_{i^*} = \beta_{i^*}$.

The proof is in [14].

The placement of the buffer has been treated as a continuous variable. While this has led to satisfying mathematical results, in reality one must develop an optimizing criterion which is physically feasible. Unfortunately, the condition $\alpha_{i^*} = \beta_{i^*}$ does not satisfy the feasibility requirements. Rarely will $i^*$ be integer and what, for example, is the physical interpretation of $i^* = 7.63$? To this end, it will be shown in this section that the steady state reliability of the line is maximized by placing the buffer after machine $i^*$ ($i^*$ integer) where $i^*$ satisfies the following condition:

$|\alpha_{i^*} - \beta_{i^*}| = \min_i |\alpha_i - \beta_i|$.

Note that if an integer $i^*$ exists such that $\alpha_{i^*} = \prod_{j=1}^{i^*} q_j = \prod_{j=i^*+1}^{N} q_j = \beta_{i^*}$, it would satisfy the above criterion and be consistent with Theorem 1.
To this end observe that $|\alpha_i - \beta_i|$ is a convex function of $\alpha_i$ that obtains its minimum point at $\alpha_i = \beta_i = \sqrt{\alpha_i\beta_i} = \sqrt{Q}$. Thus, for

$\alpha_i < \alpha_j < \sqrt{Q}$, $|\alpha_j - \beta_j| < |\alpha_i - \beta_i|$, and for $\sqrt{Q} < \alpha_j < \alpha_i$, $|\alpha_j - \beta_j| < |\alpha_i - \beta_i|$.

THEOREM 3 (Fundamental): The optimal integer placement of a single buffer of capacity $M$ in an $N$ machine line is where $|\alpha_i - \beta_i|$ is minimized.

PROOF: From Theorem 1 we know that by treating $i$ as a continuous variable the optimal placement $i^*$ satisfies $\alpha_{i^*} = \beta_{i^*}$. If $i^*$ is integer the theorem is evident. Assume that $i^*$ is not integer. Examine the points $[i^*]$ and $[i^* + 1]$. From Lemma 1 and the convexity of $|\alpha_i - \beta_i|$ we know that $R([i^*]) > R(K_1)$ where $\alpha_{K_1} > \alpha_{[i^*]}$ and $R([i^* + 1]) > R(K_2)$ where $\alpha_{[i^*+1]} > \alpha_{K_2}$. Thus, the only two candidate placements are $[i^*]$ and $[i^* + 1]$.

If $|\alpha_{[i^*]} - \beta_{[i^*]}| = |\alpha_{[i^*+1]} - \beta_{[i^*+1]}|$ then the theorem holds and either placement is optimal. Therefore, assume that $|\alpha_{[i^*]} - \beta_{[i^*]}| < |\alpha_{[i^*+1]} - \beta_{[i^*+1]}|$. We want to show that $R([i^*]) > R([i^* + 1])$. Assume the contrary, i.e., that $R([i^* + 1]) > R([i^*])$. From Theorem 2 we know that there exists a point $K^*$ such that $R(K^*) = R([i^* + 1])$ and that $|\alpha_{K^*} - \beta_{K^*}| = |\alpha_{[i^*+1]} - \beta_{[i^*+1]}|$. This implies that $R(K^*) > R([i^*])$. We know that $|\alpha_{K^*} - \beta_{K^*}| > |\alpha_{[i^*]} - \beta_{[i^*]}|$ and since both $\alpha_{K^*}$ and $\alpha_{[i^*]}$ must be greater than $\sqrt{\alpha_i\beta_i}$ this implies that $\alpha_{K^*} > \alpha_{[i^*]}$. By Theorem 2 this would imply that $R([i^*]) > R(K^*)$, which is a contradiction. Similar results may be obtained by assuming that $|\alpha_{[i^*]} - \beta_{[i^*]}| > |\alpha_{[i^*+1]} - \beta_{[i^*+1]}|$.

Theorem  3  details  a  simple,  yet  elegant  criterion  for  the  optimal  placement  of  a  single 
buffer  regardless  of  capacity  so  as  to  maximize  the  reliability  of  the  system. 
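
The criterion of Theorem 3 can be compared against direct enumeration of $R(i)$ on a small example. The sketch below is an illustration under our own assumptions: the function names are hypothetical, the machine reliabilities are made up, all reliabilities are assumed to lie strictly between 0 and 1, and candidate positions are restricted to $i = 1, \ldots, N-1$.

```python
from math import prod


def reliability(q, i, M):
    """Line reliability with a buffer of capacity M after machine i (1-based)."""
    alpha, beta = prod(q[:i]), prod(q[i:])
    base, gain = alpha * beta, beta * (1.0 - alpha)
    if abs(alpha - beta) < 1e-12:            # equal segment reliabilities
        return base + gain * M / (M + 1)
    rho = (alpha - base) / (beta - base)
    return base + gain * (rho - rho ** (M + 1)) / (1.0 - rho ** (M + 1))


def best_by_criterion(q):
    """Position minimizing |alpha_i - beta_i| (the criterion of Theorem 3)."""
    return min(range(1, len(q)), key=lambda i: abs(prod(q[:i]) - prod(q[i:])))


def best_by_enumeration(q, M):
    """Position maximizing the reliability by evaluating every candidate."""
    return max(range(1, len(q)), key=lambda i: reliability(q, i, M))


q = [0.9, 0.8, 0.95, 0.85]  # hypothetical, non-identical machine reliabilities
print(best_by_criterion(q), best_by_enumeration(q, M=2))
```

For this data both functions select the position after machine 2. The criterion needs only the products $\alpha_i$ and $\beta_i$, whereas enumeration evaluates the full reliability expression for each candidate.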

A Special Case: $q_i = q\ \forall i$.

Consider the case where $q_i = q\ \forall i$. In this case:

$\alpha_i = q^i$,

$\beta_i = q^{N-i}$.

It follows from Theorems 1 and 3 that if $N$ is even, the optimal placement would be where $\alpha_i = \beta_i$, which in this case is where $q^i = q^{N-i}$, which is satisfied at $i = N/2$. This is consistent with the results developed by Soyster and Toof [13].

Assume that $N$ is odd. Then $N$ is of the form $2K + 1$ where $K$ is integer and by Theorem 3 the optimal placement is either after machine $K$ or machine $K + 1$ since:

$|\alpha_K - \beta_K| = |q^K - q^{K+1}|$,

$|\alpha_{K+1} - \beta_{K+1}| = |q^{K+1} - q^{K}| = |q^K - q^{K+1}|$.

We have just completed the proof for the optimal location of a single buffer on an $N$ machine serial line. The result holds for any $N$ (even or odd) and for any $q_i$ (whether or not the machine reliabilities are identical). In the next section we generalize the model to include more than one buffer.

3.  TWO  BUFFERS  OF  CAPACITY  ONE  UNIT 

Consider a simpler case of the general model where $N = 3k$ and $q_i = q$ for all $i$. The placement of two buffers separates the line into three segments. Since $N = 3k$, one may arbitrarily place the first buffer immediately after machine $k$ and the second immediately after machine $2k$. The placement of these two buffers has just defined the three stages of the system. Each stage may be comprised of more than one machine; for a line of $N = 3k$, each stage is comprised of $k$ machines. The reliability of each stage is $Q_1 = Q_2 = Q_3 = q^k = Q$ and

$P = 1 - Q$.

The  two  buffer  system  operates  analogously  to  the  one  buffer  system  described  in  section 
2.  If  all  machines  are  up,  then  a  unit  of  raw  material  is  processed  by  stages  one,  two  and  three 
and  a  finished  good  is  produced.  If,  for  example,  stage  three  is  down,  stages  one  and  two  are  up 
and  buffer  two  is  not  full,  then  both  stages  one  and  two  operate  and  a  semicompleted  good 
would  be  stored  in  buffer  two.  If  buffer  two  had  been  full  and  buffer  one  had  not,  then 
machine  two  would  not  operate;  it  would  be  blocked  by  the  second  buffer  which  is  full.  In  this 
case  only  machine  one  would  operate  and  a  semiprocessed  good  would  be  stored  in  buffer  one. 

Define an ordered pair $(X, Y)$ where $X$ represents the quantity of semifinished goods in buffer one at the start of cycle $t$, and $Y$ the quantity in buffer two at the start of cycle $t$. If we assume that the maximum capacity of both buffers one and two is one, then the pair $(X, Y)$ may take on the following four values: (0,0), (1,0), (0,1), and (1,1). The one cycle transition probability from state $(X, Y) = (0,0)$ to all states is:

• Both are empty at the start of cycle $t + 1$ if either all stages are up, or if stage one is down. Thus: $P[(X_{t+1}, Y_{t+1}) = (0,0) \mid (X_t, Y_t) = (0,0)] = Q^3 + P$.

• If stage one is up during cycle $t$ but stage two is down, then a unit of raw material is processed on stage one and the semicompleted good stored in buffer one. Thus: $P[(X_{t+1}, Y_{t+1}) = (1,0) \mid (X_t, Y_t) = (0,0)] = QP$.

• If both stages one and two are up but stage three is down, then a unit of raw material is processed on both stages one and two and the semicompleted good stored in buffer two. Thus: $P[(X_{t+1}, Y_{t+1}) = (0,1) \mid (X_t, Y_t) = (0,0)] = Q^2P$.

• Lastly, note that it is impossible for $(X_{t+1}, Y_{t+1})$ to equal (1,1) given that $(X_t, Y_t) = (0,0)$, as at most one unit may be added to storage during any cycle. Thus: $P[(X_{t+1}, Y_{t+1}) = (1,1) \mid (X_t, Y_t) = (0,0)] = 0$.

One  may  compute  the  transition  probabilities  for  all  of  the  four  possible  states  in  an  analogous 
manner.  The  complete  transition  matrix  is  presented  in  Figure  2. 


State in t \ State in t+1     (0,0)        (1,0)        (0,1)          (1,1)

(0,0)                          $Q^3 + P$    $QP$         $Q^2P$         0

(1,0)                          $Q^2P$       $Q^3 + P$    $QP^2$         $Q^2P$

(0,1)                          $QP$         $Q^2P$       $Q^3 + P^2$    $QP$

(1,1)                          0            $QP$         $Q^2P$         $Q^3 + P$

Figure 2. Transition matrix, two buffer system
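
The entries of Figure 2 can be reproduced by enumerating the eight up/down patterns of the three stages. The sketch below encodes one reading of the verbal operating rules (a stage runs only if it is up, has something to work on, and has somewhere to put its output); the rule encoding, function names and state ordering are our own assumptions rather than the authors' program.

```python
from fractions import Fraction
from itertools import product

CAP = 1  # capacity of each of the two buffers


def step(state, up):
    """One cycle: state = (b1, b2) buffer contents, up = (u1, u2, u3) stage status."""
    b1, b2 = state
    u1, u2, u3 = up
    # A stage is "willing" if it is up and its output has somewhere to go.
    w3 = u3
    w2 = u2 and (w3 or b2 < CAP)
    w1 = u1 and (w2 or b1 < CAP)
    # A willing stage operates if it has input: raw material, a hand-off, or a buffered unit.
    o1 = w1
    o2 = w2 and (o1 or b1 > 0)
    o3 = w3 and (o2 or b2 > 0)
    return (b1 + o1 - o2, b2 + o2 - o3)


STATES = [(0, 0), (1, 0), (0, 1), (1, 1)]


def transition_matrix(Q):
    """Accumulate the probability of each up/down pattern into a 4 x 4 matrix."""
    P = 1 - Q
    B = {s: {t: Fraction(0) for t in STATES} for s in STATES}
    for s in STATES:
        for up in product([True, False], repeat=3):
            prob = Fraction(1)
            for u in up:
                prob *= Q if u else P
            B[s][step(s, up)] += prob
    return [[B[s][t] for t in STATES] for s in STATES]


def figure2(Q):
    """The matrix as printed in Figure 2."""
    P = 1 - Q
    return [
        [Q**3 + P, Q * P, Q**2 * P, 0],
        [Q**2 * P, Q**3 + P, Q * P**2, Q**2 * P],
        [Q * P, Q**2 * P, Q**3 + P**2, Q * P],
        [0, Q * P, Q**2 * P, Q**3 + P],
    ]


for Q in (Fraction(1, 2), Fraction(4, 5)):
    print(Q, transition_matrix(Q) == figure2(Q))
```

Exact rational arithmetic is used so that the comparison with Figure 2 is an equality test rather than a tolerance check.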


Let $\pi_1$, $\pi_2$, $\pi_3$, $\pi_4$ be the steady state probabilities of buffer states (0,0), (1,0), (0,1) and (1,1) respectively. The system is in state (0,0) with probability $\pi_1$; then a good is produced if and only if all three stages are up. This event has a probability of $Q^3\pi_1$. Similarly, with probability $\pi_2$ the system is in state (1,0); then only stages two and three must be up for a finished good to be produced. This event has probability $Q^2\pi_2$. Lastly, in both state (0,1) and (1,1), buffer two is not empty and thus the only condition for a successful cycle is that stage three must be up. These events have probability $Q\pi_3$ and $Q\pi_4$, respectively. The steady state reliability, $R$, of the two buffer system where the capacity of both buffer one and buffer two is one unit is equal to:

(7) $R = Q^3\pi_1 + Q^2\pi_2 + Q\pi_3 + Q\pi_4$.


Thus, upon determining the steady state probabilities $\pi_1$, $\pi_2$, $\pi_3$ and $\pi_4$, one has an exact formulation of the reliability of the three stage, two buffer system, where each buffer has a capacity of one unit.

From the transition matrix presented as Figure 2 and basic finite Markov chain theory, one can calculate $\pi_1$, $\pi_2$, $\pi_3$, and $\pi_4$ in the following manner.

First, we know that in the steady state $\pi B = \pi$ where $B$ is the one step transition matrix of the system (Figure 2) and

$\pi = (\pi_1, \pi_2, \pi_3, \pi_4)$.

This identity yields a system of four simultaneous equations of the form

(8) $\pi(B - I) = 0$


where $B$ is of the form:

$B = \begin{pmatrix} Q^3 + P & QP & Q^2P & 0 \\ Q^2P & Q^3 + P & QP^2 & Q^2P \\ QP & Q^2P & Q^3 + P^2 & QP \\ 0 & QP & Q^2P & Q^3 + P \end{pmatrix}$.

However, $(B - I)$ has no inverse as the rows are linearly dependent. The classical method of solution to this problem is to drop one of the identity equations of $\pi$ and substitute the fact that the sum of the steady state probabilities must equal one. That is, $\pi_1 + \pi_2 + \pi_3 + \pi_4 = 1$.

Making this substitution for column 3 of $B - I$ yields the following system of simultaneous equations: $\pi A = (0, 0, 1, 0)$, where:

$A = \begin{pmatrix} Q^3 + P - 1 & QP & 1 & 0 \\ Q^2P & Q^3 + P - 1 & 1 & Q^2P \\ QP & Q^2P & 1 & QP \\ 0 & QP & 1 & Q^3 + P - 1 \end{pmatrix}$.

Thus, $\pi = (0, 0, 1, 0)A^{-1}$, which reduces to $\pi = A^{-1}_{3}$, where $A^{-1}_{3}$ is the third row of the inverse matrix $A^{-1}$. The solution to the last system of four equations and four variables is:

(9) $\pi_1 = (Q^2 + Q + 1)/(4Q^2 + 3Q + 5)$

$\pi_2 = (Q^2 + Q + 2)/(4Q^2 + 3Q + 5)$

$\pi_3 = (Q^2 + 1)/(4Q^2 + 3Q + 5)$

$\pi_4 = (Q^2 + Q + 1)/(4Q^2 + 3Q + 5)$


We are now able to directly compute the steady state reliability of a two buffer series system where each stage has identical reliability, $Q$, distributed Bernoulli, and each buffer a capacity of one unit. We have just proved Theorem 4, which results from (7) and (9).

THEOREM 4: For the series production system described above the steady state reliability of the system, $R$, is equal to:

$R = \dfrac{Q^5 + 2Q^4 + 4Q^3 + 3Q^2 + 2Q}{4Q^2 + 3Q + 5}$.
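
The column-replacement procedure and Theorem 4 can be checked with a short exact-arithmetic computation. The sketch below is our own illustration, not the authors' program: it rebuilds the matrix of Figure 2, replaces column 3 of $B - I$ by ones, solves the resulting system by Gauss-Jordan elimination, and compares the outcome with the closed forms. All function names are hypothetical, and $Q$ is assumed to be supplied as a `Fraction`.

```python
from fractions import Fraction


def figure2_matrix(Q):
    """One step transition matrix B of Figure 2 (states (0,0), (1,0), (0,1), (1,1))."""
    P = 1 - Q
    return [
        [Q**3 + P, Q * P, Q**2 * P, 0],
        [Q**2 * P, Q**3 + P, Q * P**2, Q**2 * P],
        [Q * P, Q**2 * P, Q**3 + P**2, Q * P],
        [0, Q * P, Q**2 * P, Q**3 + P],
    ]


def steady_state(B):
    """Solve pi * A = (0, 0, 1, 0), where A is B - I with column 3 replaced by ones."""
    n = len(B)
    A = [[B[r][c] - (1 if r == c else 0) for c in range(n)] for r in range(n)]
    for r in range(n):
        A[r][2] = Fraction(1)
    # pi * A = e is the same as A^T * pi^T = e^T; build that augmented system.
    aug = [[A[r][c] for r in range(n)] + [Fraction(1 if c == 2 else 0)] for c in range(n)]
    for col in range(n):  # Gauss-Jordan elimination with a simple pivot search
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [x / aug[col][col] for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]


def closed_form(Q):
    """Steady state probabilities as stated in the text."""
    D = 4 * Q**2 + 3 * Q + 5
    return [(Q**2 + Q + 1) / D, (Q**2 + Q + 2) / D, (Q**2 + 1) / D, (Q**2 + Q + 1) / D]


def reliability(Q, pi):
    """Equation (7)."""
    return Q**3 * pi[0] + Q**2 * pi[1] + Q * pi[2] + Q * pi[3]


def theorem4(Q):
    return (Q**5 + 2 * Q**4 + 4 * Q**3 + 3 * Q**2 + 2 * Q) / (4 * Q**2 + 3 * Q + 5)


Q = Fraction(1, 2)
pi = steady_state(figure2_matrix(Q))
print(pi, reliability(Q, pi), theorem4(Q))
```

For $Q = 1/2$ the computed probabilities agree with the closed forms, and the reliability from (7) agrees with Theorem 4.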

4. EXTENSION OF THE GENERAL MULTIBUFFER CASE

The previous sections have laid the groundwork for our analysis of a general multistage, multibuffer system such as the one depicted in Figure 3. For ease of analysis let us assume that the reliability of each stage has the Bernoulli distribution with parameter $Q$ and further that buffer $i$ has capacity $M_i$. For a general $N$ stage system with $m$ buffers, there are $\prod_{i=1}^{m}(M_i + 1)$ possible buffer states; i.e., each buffer may take on $M_i + 1$ values and there are $m$ such buffers. For example, if $M_i = 4$ for all $i$, and $m = 5$ there would be 3,125 possible buffer states ranging in value from (0,0,0,0,0) to (4,4,4,4,4). The question arises as to the viability of this form of analysis for systems with large buffer capacity ($M_i$), multiple buffers ($m$) or a combination of the two. Clearly, the transition matrix for a large system would be relatively sparse (i.e., many zero entries). For example, in a four stage (three buffer) system, where each buffer has a capacity of three units, there would be $4^3 = (3 + 1)^3$ or 64 possible transition states. For the starting state (1,1,1) there are 13 possible transitions (i.e., nonzero transition probabilities). The feasible transitions from the state (1,1,1) are:

(0,1,1), (0,1,2), (0,2,1), (1,0,1), (1,0,2), (1,1,0), (1,1,1),

(1,1,2), (1,2,0), (1,2,1), (2,0,1), (2,1,0), and (2,1,1).


Figure 3. General multistage, multibuffer system


While it is obvious that the method of analysis employed to this point is feasible, that is, (1) definition of a one step transition matrix, (2) development of a reliability equation as a function of stage reliability and the steady state transition probabilities, and (3) solving a system of linear equations for the steady state transition probabilities; its application is, for the most part, not practical.

Let us present the transition matrices for two or three buffer systems with capacity one or two. For the system of two buffers of capacity two the transition matrix is given in Figure 4 and the steady state probabilities for various values of $Q$ are given in Figure 5, where the reliability $R$ is:

$R = Q^3\pi_1 + Q\pi_2 + Q\pi_3 + Q^2\pi_4 + Q\pi_5 + Q\pi_6 + Q^2\pi_7 + Q\pi_8 + Q\pi_9$.

Figure 5 was calculated by a small computer program. For various values of $Q$, we solved for the unique $\pi_i$ and calculated $R$, which appears in Figure 5. For the system of four stages and three buffers with capacity one, the transition matrix is given in Figure 6.

Again, using a small computer program we solved for $\pi_i$ and calculated $R$. The steady state probabilities and the system reliability $R$ are given in Figure 7, where

$R = Q^4\pi_1 + Q\pi_2 + Q^2\pi_3 + Q\pi_4 + Q^3\pi_5 + Q\pi_6 + Q^2\pi_7 + Q\pi_8$.
[Figures 4 through 7: transition matrices and steady state probability tables, including the case "Buffer Capacity Equals 2".]
The approach we present here can be summarized as follows: for a given configuration of a serial production line with multiple buffers and no restriction on their capacity, one can write the one step transition probability matrix and solve for its steady state probabilities, which yields the reliability of the line. The method is efficient for a small number of buffers and small capacities. In general, the number of state variables and the number of linear equations are

$\prod_{i=1}^{m}(M_i + 1)$ for $m$ buffers with capacity $M_i$.
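
The growth of the state space is easy to tabulate. The sketch below, with function names of our own choosing, counts and lists the buffer states for given capacities; the lexicographic ordering of the listed states is an arbitrary choice made here for illustration.

```python
from itertools import product
from math import prod


def num_states(capacities):
    """Number of buffer states: the product of (M_i + 1) over all buffers."""
    return prod(M + 1 for M in capacities)


def enumerate_states(capacities):
    """All buffer-content vectors, in lexicographic order."""
    return list(product(*(range(M + 1) for M in capacities)))


print(num_states([4] * 5))        # five buffers of capacity four
print(num_states([3] * 3))        # three buffers of capacity three
print(enumerate_states([1, 1]))   # two buffers of capacity one
```

The first two counts correspond to the 3,125-state and 64-state examples quoted in the text. Each added buffer multiplies the number of linear equations to be solved, which is why the method is described as efficient only for small systems.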
ABSTRACT

Consider a set of task pairs coupled in time: a first (initial) and second (completion) task of known durations with a specified time between them. If the operator or machine performing these tasks is able to process only one at a time, scheduling is necessary to insure that no overlap occurs. This problem has a particular application to production scheduling, transportation, and radar operations (send-receive pulses are ideal examples of time-linked tasks requiring scheduling). This article discusses several candidate techniques for schedule determination, and these are evaluated in a specific radar scheduling application.

This  article  considers  the  problem  of  scheduling  task  pairs,  i.e.,  tasks  which  consist  of  two 
coupled  tasks,  an  initial  task  and  a  completion,  separated  by  a  known,  fixed  time  interval.  If 
the  operator  or  machine  performing  these  tasks  is  only  able  to  process  one  at  a  time,  scheduling 
is  necessary  to  insure  that  a  completion  task  of  one  pair  does  not  arrive  for  processing  while 
one  part  of  another  task  is  being  processed. 

Consider,  for  example,  a  radar  tracking  aircraft  approaching  a  large  airport  [1].  In  order 
to  track  adequately,  it  is  necessary  to  transmit  pulses  and  receive  the  reflection  once  every 
specified  update  period.  The  radar  cannot  transmit  a  pulse  at  the  same  time  that  a  reflected 
pulse  is  arriving  nor  can  two  reflected  pulses  overlap.  A  possible  strategy  is  to  transmit  to  one 
tracked  object  and  wait  for  that  pulse  to  return  before  another  pulse  is  transmitted  as  shown  in 
Figure  1(a),  but  unless  the  number  of  objects  being  tracked  is  small,  this  may  not  allow  all 
objects to be tracked in each update period. A more efficient strategy is some form of interlaced scheduling like that shown in Figure 1(b). Observe that the time between each pair of transmit and receive pulses is the same in Figure 1(b) as in Figure 1(a), yet the total transmission time is far less in 1(b).
1. NOTATION, CLASSIFICATION, AND COMPLEXITY

Our object is to generate a schedule for a given set of task pairs which allows that set to be completed in the least possible time with no overlap between tasks (Figure 2). Formally, let

$t_i$ = the time of initiation of the $i$th task pair;

$S_i$ = the duration of the initial task of the $i$th pair, $i = 1, 2, \ldots, N$;

$T_i$ = the duration of the completion task of the $i$th pair, $i = 1, 2, \ldots, N$;

$d_i$ = the "inter-task" duration, i.e., the time between the initiation of the initial task of the $i$th pair and the initiation of that pair's completion.

The time between the initiation of the first task pair and the completion of the final pair we refer to as the frame time (or makespan, cf. [3,4]), denoted $z$. For convenience, we will set the initiation time of the first pair to 0.

The scheduling problem may be stated as

find $t_i \geq 0$, $i = 1, \ldots, N$, to minimize

$z = \max_i\,(t_i + d_i + T_i)$

subject to the constraint that no member of the set of intervals

$\{(t_i,\ t_i + S_i),\ (t_i + d_i,\ t_i + d_i + T_i)\}$, $i = 1, \ldots, N$,

overlaps with any other member.
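
The problem statement translates directly into a feasibility test and an objective evaluation. The following is a minimal sketch under our own assumptions: the function names are hypothetical, all durations are taken to be positive with $d_i \geq S_i$, and intervals that merely touch at an endpoint are treated as non-overlapping.

```python
def busy_intervals(t, S, d, T):
    """(start, end) of every initial and completion task for start times t."""
    out = []
    for ti, Si, di, Ti in zip(t, S, d, T):
        out.append((ti, ti + Si))            # initial task of the pair
        out.append((ti + di, ti + di + Ti))  # completion task of the pair
    return out


def is_feasible(t, S, d, T):
    """True if no two busy intervals overlap."""
    busy = sorted(busy_intervals(t, S, d, T))
    # After sorting by start time it is enough to compare neighbouring intervals.
    return all(prev_end <= start for (_, prev_end), (start, _) in zip(busy, busy[1:]))


def frame_time(t, d, T):
    """z = max_i (t_i + d_i + T_i)."""
    return max(ti + di + Ti for ti, di, Ti in zip(t, d, T))


S, d, T = [1, 1], [3, 3], [1, 1]   # two hypothetical task pairs
for t in ([0, 1], [0, 3], [0, 4]):
    print(t, is_feasible(t, S, d, T), frame_time(t, d, T))
```

In this example the interlaced schedule `[0, 1]` is feasible with frame time 5, the schedule `[0, 3]` is infeasible because the second initial task collides with the first completion, and the non-interlaced schedule `[0, 4]` is feasible but has frame time 8.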

To  put  this  problem  into  context  with  much  of  the  recent  literature  classifying  scheduling 
problems  with  regard  to  their  computational  complexity,  we  observe  that  the  problem  as  stated 
is  equivalent  to  a  job  shop  problem  where  N  jobs  are  to  be  scheduled  on  two  machines  with  the 
following  characteristics*: 

1. Each job requires three operations: the first (of duration $S_i$) to be processed on Machine 1; the second (of duration $d_i - S_i$) on Machine 2; the third (of duration $T_i$) again on Machine 1.

2. Machine 1 may only process one operation at a time; Machine 2, however, has infinite processing capacity.


*Under the classification scheme of Rinnooy Kan [9], this problem is n|2|G, no wait, M2 non bott. See also [8].
3. No waiting between operations is permitted. That is, once a job is begun, it must proceed from Machine 1 to Machine 2 and back again to Machine 1 with no delay.

The problem can then be shown to be NP-complete by Theorem 5.7, pg. 93 in [9] or by a reduction from KNAPSACK in [6]. NP-complete problems form an equivalence class of combinatorial problems for which no nonenumerative algorithms are known. If an "efficient" algorithm were constructed which could solve any problem in this class, any other would also be solvable in polynomial time (cf. [2,4,6,7]). Members of this class include the chromatic number problem, the knapsack problem, and the traveling salesman problem.

The  fact  that  a  polynomial-bounded  algorithm  is  not  likely  to  exist  motivates  the  construc¬ 
tion  of  several  polynomial-bounded  algorithms  which  are  presented  and  evaluated  in  Sections  2 
and  3.  An  integer  programming  formulation  leads  to  a  straightforward  branch  and  bound  pro¬ 
cedure  which  makes  use  of  the  problem’s  special  structure.  (See  [11].)  In  view  of  the  fact  that 
this  optimal  procedure  is  likely  to  be  tractable  only  for  very  small  problems,  and  not  even  then 
for  radar-like  applications  requiring  real  time  solution,  we  proceed  directly  to  consideration  of 
three  suboptimal  algorithms. 

2.  SUB-OPTIMAL  ALGORITHMS 

This  section  considers  scheduling  procedures  which  can  be  shown  to  be  polynomially 
bounded:  Sequencing,  Nesting  and  Fitting.  After  some  discussion  of  their  characteristics,  they 
will  be  evaluated  on  realistic  examples  in  Section  3. 

Sequencing 

An ordered set of p task pairs is said to be sequenced when the completion tasks arrive
for processing in the same order as the initial tasks were scheduled. p pairs can be sequenced
whenever

(1)  $d_1 \ge \sum_{i=1}^{p} S_i$  and

(2)  $d_i \ge d_{i-1} + T_{i-1} - S_{i-1}$,  $i = 2, 3, \ldots, p$.

If, as is the case for many applications, $S_i = T_i$ for each task pair, (2) becomes simply

(3)  $d_i \ge d_{i-1}$,

and implementation of this procedure becomes quite easy.

We may think of this procedure as "jamming" initial tasks together until they run into the
completion task corresponding to the first initial task. The completion tasks are guaranteed not
to overlap since each succeeding $d_i$ is at least as large as the one before. Also, since this is a
"single-pass" procedure (cf. [3]), computation time is linear in N.*

In any sequenced p-set, dead time can occur in two ways, as is shown in Figure 3. It
occurs between the last initial task and the first completion, and it occurs between successive
completions. The former can be written as

$$d_1 - \sum_{i=1}^{p} S_i$$


*Actually, computation time is $O(N \log N)$ since the $d_i$ have to be ordered.
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Figure  3  Sequencing 


and the latter as

$$\sum_{i=2}^{p} (d_i - d_{i-1}) = d_p - d_1.$$

Hence,

$$z_{SEQ} = \sum_{i=1}^{p} (S_i + T_i) + d_1 - \sum_{i=1}^{p} S_i + d_p - d_1 = \sum_{i=1}^{p} T_i + d_p.$$

Hence, if N task pairs are sequenced in P p-sets, the kth set having $p_k$ pairs, k = 1, 2, ..., P,
the total frame time may be represented as

$$z_{SEQ} = \sum_{k=1}^{P} \Big[ \sum_{i=1}^{p_k} T_i + d_{p_k} \Big] = \sum_{i=1}^{N} T_i + \sum_{k=1}^{P} d_{p_k}.$$

As  an  example,  consider  the  following  7  task  pairs  with  common  durations  for  initial  and 
completion  tasks,  ordered  by  increasing  d,. 

i = 1:  $S_1 = T_1 = 2$,  $d_1 = 9$
i = 2:  $S_2 = T_2 = 1$,  $d_2 = 13$
i = 3:  $S_3 = T_3 = 2$,  $d_3 = 15$
i = 4:  $S_4 = T_4 = 3$,  $d_4 = 15$
i = 5:  $S_5 = T_5 = 2$,  $d_5 = 19$
i = 6:  $S_6 = T_6 = 4$,  $d_6 = 24$
i = 7:  $S_7 = T_7 = 3$,  $d_7 = 25$.

Figure  4(a)  shows  their  sequenced  schedule. 
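
The sequencing rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's implementation: it assumes that p-sets are formed greedily (pairs are taken in order of increasing $d_i$, and a new p-set is opened as soon as condition (1) or (2) would fail), and the names `sequence_frame_time` and `EXAMPLE` are hypothetical.

```python
from typing import List, Tuple

# A task pair is (S, d, T): initial-task duration, delay from the start of the
# initial task to the start of the completion task, completion-task duration.
Pair = Tuple[int, int, int]

# The seven task pairs of the example (S_i = T_i for every pair).
EXAMPLE: List[Pair] = [
    (2, 9, 2), (1, 13, 1), (2, 15, 2), (3, 15, 3),
    (2, 19, 2), (4, 24, 4), (3, 25, 3),
]


def sequence_frame_time(pairs: List[Pair]) -> Tuple[int, List[List[Pair]]]:
    """Greedily pack pairs (in order of increasing d) into sequenced p-sets."""
    groups: List[List[Pair]] = []
    for pair in sorted(pairs, key=lambda x: x[1]):
        s, d, _ = pair
        if groups:
            current = groups[-1]
            first_d = current[0][1]
            prev_s, prev_d, prev_t = current[-1]
            # Condition (1): all initial tasks finish before the first completion.
            initials_fit = sum(x[0] for x in current) + s <= first_d
            # Condition (2): completions arrive in order without overlapping.
            completions_fit = d >= prev_d + prev_t - prev_s
            if initials_fit and completions_fit:
                current.append(pair)
                continue
        groups.append([pair])  # open a new p-set
    # A p-set lasts from its first initial task to the end of its last completion.
    total = sum(
        sum(x[0] for x in g[:-1]) + g[-1][1] + g[-1][2] for g in groups
    )
    return total, groups


frame_time, p_sets = sequence_frame_time(EXAMPLE)
print(frame_time, [len(g) for g in p_sets])
```

Under the greedy assumption the example splits into two p-sets of 4 and 3 pairs, and the frame time is 57, in agreement with the closed form: the $T_i$ sum to 17 and the last pairs of the two p-sets have $d = 15$ and $d = 25$.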

For comparison, Figure 4(b) shows the optimal schedule for this set of task pairs as
generated by the branching algorithm alluded to above. At the other extreme, if these pairs were
scheduled by waiting until each pair was completely processed before initiating the next, the
frame time would be 138

$$\Big[ z = \sum_i (d_i + T_i) = 138. \Big]$$


Nesting 

An ordered set of p task pairs is said to be nested whenever the completion tasks arrive
for processing in the reverse of the order in which the initial tasks were scheduled. p pairs may
be nested if

(4)  $d_i \ge d_{i+1} + T_{i+1} + S_i$,  $i = 1, \ldots, p-1$.

Figure 4. Sequencing and nesting

Applying this procedure to the 7-pair example discussed above gives the schedule shown
in Figure 4(c) with z = 70.

Fitting 

This procedure, unlike the two discussed above, allows the user to specify a priority
ordering, and corresponds intuitively to the simple process which one might use when scheduling
task pairs by hand. After setting the desired order and scheduling the first task pair at time 0,
each successive pair is scheduled at the earliest possible time not involving any overlap with
pairs already scheduled.

Let us consider this procedure for the above example, taking an arbitrary ordering:
2,6,7,4,3,1,5. As shown in Figure 5(a), task pair 2 is scheduled at time 0, and pairs 6 and 7
can successively be scheduled with no overlap. If, however, we try to schedule pair 4 at the
first available time, its completion would overlap with pair 6's completion (Figure 5(b)), so this
is not possible. The first available time for scheduling task pair 4 without overlap is time 18
(Figure 5(c)). Pair 3, however, having task duration only 2, can be scheduled at time 8
(Figure 5(d)). Observe now that pair 1 can be scheduled nowhere in the existing schedule without
overlap, so it must be "tacked" onto the end, at time 36 (Figure 5(e)). Pair 5 is scheduled at
time 21, completing the schedule with z = 47 (Figure 5(f)).
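
The fitting walk-through above can be reproduced with a short Python sketch. It is a minimal illustration rather than the author's code: it assumes integer task data (so that only integer start times need to be tried), treats intervals as half-open, assumes $d_i \ge S_i$ for every pair, and uses the hypothetical names `fit`, `overlaps` and `PAIRS`.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]   # half-open interval [start, end)
Pair = Tuple[int, int, int]  # (S, d, T), with d >= S assumed

# The seven task pairs of the example, keyed by pair number.
PAIRS: Dict[int, Pair] = {
    1: (2, 9, 2), 2: (1, 13, 1), 3: (2, 15, 2), 4: (3, 15, 3),
    5: (2, 19, 2), 6: (4, 24, 4), 7: (3, 25, 3),
}


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals share any time."""
    return a[0] < b[1] and b[0] < a[1]


def fit(pairs: Dict[int, Pair], order: List[int]) -> Tuple[Dict[int, int], int]:
    """Schedule each pair, in priority order, at its earliest feasible start."""
    busy: List[Interval] = []
    starts: Dict[int, int] = {}
    for k in order:
        s, d, t = pairs[k]
        start = 0
        while True:
            initial = (start, start + s)
            completion = (start + d, start + d + t)
            clash = any(
                overlaps(initial, b) or overlaps(completion, b) for b in busy
            )
            if not clash:
                break
            start += 1  # integer data assumed: try the next time unit
        busy.extend([initial, completion])
        starts[k] = start
    frame_time = max((end for _, end in busy), default=0)
    return starts, frame_time


starts, z = fit(PAIRS, [2, 6, 7, 4, 3, 1, 5])
print(starts, z)
```

With the ordering 2,6,7,4,3,1,5 the sketch places pair 4 at time 18, pair 3 at time 8, pair 1 at time 36 and pair 5 at time 21, giving z = 47 as in the text. The unit-step scan is chosen for clarity; it is not an efficient way to find the earliest feasible start.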

3.  TASK  PAIR  SIMULATION  AND  NUMERICAL  RESULTS 

In keeping with the radar application mentioned above, a simulation has been developed
to generate aircraft configurations suitable to radar operation. For each object, range,
cross-section, and velocity can be used to determine the necessary length of transmit and receive
pulses (of the order of 10-100 μsec) as well as the inter-pulse distance (of the order of
300-1300 μsec). Thus, a list of task pairs can be generated for evaluation of the procedures
outlined in the previous section. As an example, such a list is given in Table I for N = 20.

For  values  of  N  shown  in  Table  II,  the  simulation  generated  50  such  task  pair  lists,  and 
the  average  frame  time  and  computation  time  were  computed.  Figure  6  presents  this  data 
graphically.  Note  that,  as  one  would  expect,  frame  time  is  linear  in  N.  This  is  not  surprising 
since  in  the  best  conceivable  situation,  that  of  no  idle  time  between  subtasks, 
TABLE I. Sample Task Pair List (N = 20)
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TABLE  II  (a)  —  Average  Simulated 
Frame Times

N      SEQUENCE         NEST              FIT
20     4.4   (.414)     7.3    (1.387)    4.0   (.381)
50     8.6   (.559)     15.0   (1.830)    7.6   (.452)
100    15.5  (.759)     27.2   (2.747)    13.8  (.679)
200    29.2  (1.197)    52.2   (4.683)    27.4  (.990)
500    70.8  (2.188)    119.2  (8.391)    66.1  (1.590)

Frame times in msec.
Quantities in parentheses are standard deviations.

TABLE II (b)  Average Computation Times

N      SEQUENCE    NEST     FIT
20     1.9         4.0      67.2
50     4.4         16.3     440.1
100    8.6         51.9     1742
200    16.9        195.3    7091
500    42.0        1064     44160



R.D  SHAPIRO 




An assumption made in the treatment of this example is that the radar operator knows the
values of $S_i$, $T_i$ and $d_i$ precisely. If there is any uncertainty, signals can overlap. A
straightforward way of avoiding this problem in a real situation where uncertainty would obviously
be present would be to "open a window" around the pulse. That is, if the object is such that
transmit and receive pulses are estimated to be of 60 μsec duration, an interval longer than 60
μsec can be allotted to these pulses to accommodate (1) the possibility that a pulse length
longer than 60 μsec might be necessary or, more important, (2) the possibility that the receive
signal might arrive sooner or later than expected. This procedure offers no conceptual difficulty
since the window around the pulses may be made large enough to guarantee that the probability
of overlap is as small as required. In order to retain frame times small enough to allow
updating every, say, 200 milliseconds, we must limit the size of the window somewhat. This does
not seem to be a severe restriction, however. For example, since frame time is linear in $\sum T_i$,
opening a window around each pulse of twice that pulse's estimated duration would cause the
frame time to be no more than doubled. The frame times of sequenced pulses in Table II(a)
indicate that even for large N, this is no problem.

A  second  possibly  problematic  characteristic  of  the  example  is  that  it  is  static,  i.e.,  no 
explicit  consideration  is  given  to  new  "jobs"  added  to  the  system  during  the  scheduling  process. 
In  job  shop  scheduling,  this  may  present  no  problem  if  jobs  are  released  to  the  shop  at 
predetermined  times.  In  radar  tracking,  however,  one  cannot  hold  enemy  missiles,  and  the 
scheduler  must  be  dynamic.  This  can  be  accomplished;  the  new  targets  may  be  inserted  into 
the  queue  of  jobs  to  be  processed,  or,  since  this  is  likely  to  be  time-consuming  when  jobs  are 
ordered  (as  in  Sequencing  and  Nesting),  all  current  jobs  can  be  processed,  followed  by  the 
newly-arrived  entries.  This  procedure  will  be  especially  efficient  for  sequencing  since  the  d's 
are  proportional  to  the  distance  between  radar  and  target,  and  new  targets  will  tend  to  appear  at 
approximately  the  same  range. 

The  necessity  to  allow  for  search  and  discrimination  as  well  as  the  tracking  activity  and 
real-time  schedule  determination  within  a  200  milli-second  period  makes  sequencing  the  only 
viable  alternative.  Even  when  real-time  processing  is  not  required,  one  wonders  whether  the 
slight  improvement  in  frame  time  allowed  by  fitting  warrants  the  extra  computational  burden. 

A  caveat  is  in  order  here:  these  results  are  somewhat  application-dependent.  It  is  quite 
possible  that  other  applications  which  produce  task  pairs  with  different  structures  will  lead  to 
different  conclusions. 

CONCLUDING  REMARKS 

In  the  above  discussion  it  has  been  assumed  that  the  operator  or  machine  can  process 
only  one  task  segment  at  a  time.  This  is  appropriate  for  the  application  being  considered,  but 
one  might  easily  imagine  instances  in  which  there  is  some  nonunit  capacity  constraint  on  the 
operator.  For  example,  if  trucks  are  being  loaded  and  unloaded  at  some  central  depot,  labor  or 
space  restrictions  might  limit  the  number  of  trucks  being  simultaneously  processed. 

Fortunately, the suboptimal procedures described above may be extended without any
problem.* Figure 7 shows how the example given in Section 2 may be sequenced if the operator
is limited to two tasks at a time. Note that due to the ordering of the inter-task durations,
sequencing guarantees that since no more than two initial tasks can overlap, no more than two
final tasks will overlap.


*The optimal enumerative procedure described in [11] is also easily extended.
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Another  extension  is  to  consider  tasks  which  consist  of  more  than  two  coupled  segments. 
The notation changes slightly: the ith task pair becomes a task set, the initial task of duration
$S_i$ followed by $n_i$ subtasks; the jth subtask is of duration $T_{ij}$ and the time at which it is
initiated is $d_{ij}$ after the initiation of the initial task (Figure 8).



Fitting, as proposed above, works well in this case, but sequencing and nesting are wasteful
since they treat the subtasks as one long task of duration $d_{i n_i} + T_{i n_i} - d_{i1}$
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ABSTRACT 

This paper examines problems of sequencing n jobs for processing by a single
resource to minimize a function of job completion times, when the availability
of the resource varies over time. A number of well-known results for
single-machine problems which can be applied with little or no modification to
the corresponding variable-resource problems are given. However, it is shown
that the problem of minimizing the weighted sum of completion times provides
an exception.


1.  INTRODUCTION 

We consider the problem of sequencing a set N = {1, 2, ..., n} of jobs to be processed
using a single homogeneous resource, where the availability of the resource varies over time.
If t represents time (measured from some origin t = 0) then we denote by r(t) the resource
available at time t and by R(t),

$$R(t) = \int_0^t r(u)\,du,$$

the cumulative availability as of time t, i.e., the area under the curve r(u) over the interval
[0, t]. See Figure 1.

Let $p_j$, j = 1, ..., n, denote the resource requirement of job j. Once $p_j$ units of
resource have been applied to job j, the job is considered complete. We denote the completion
time of job j by $C_j$. In all problems treated the objective is to minimize G, a function of the
completion times of the jobs, where G is assumed to be a regular measure (see [1], Chapter 2).

This model is a generalization of the single-machine sequencing model. The generalization
to a resource capacity that varies over time allows for situations in which machine availability
is interrupted for scheduled maintenance or temporarily reduced to conserve energy. It also
allows for a situation in which processing requirements are stated in terms of man-hours and
labor availability varies over time.

In the single-machine case the resource profile r(t) is constant (typically r(t) = 1), and
the cumulative profile R(t) is a straight line with slope r(t). Time is measured in some basic
unit such as hours; and completion times, ready times, due dates and tardiness are expressed in
the same units. Resource requirements (processing times) are simply requirements for
intervals on the time-axis.

In the variable-resource problem, the exact correspondence between the requirement for a
unit of resource and the requirement for a unit interval on the time axis is lost. This lack of
correspondence arises from the fact that there may be a number of units of resource available
during a particular unit of time and a different number during the next. In the single-machine
problem if a job j is sequenced to follow the jobs in B (where B is any subset of N) then job j
will be complete at time $C_j$,

$$C_j = p(B) + p_j,$$

where $p(B) = \sum_{i \in B} p_i$ and $p_j$ denotes the processing time for job j. In the
variable-resource problem it is appealing to analogously specify the completion time of job j by

(1)  $C_j = t(p(B) + p_j)$

where $p_j$ is the resource requirement of job j and t(Q) is the (smallest) point on the time axis
corresponding to R(t) = Q. See Figure 1. In effect, jobs are sequenced on the resource axis,
while their completion times are measured on the time axis. For the single-machine problem
the completion point of job j is the same on both axes, but such is not the case for the
variable-resource problem.

Notice that this specification implicitly assumes that the resource available at any point in
time  is  devoted  entirely  to  the  processing  of  a  single  job.  Thus,  for  example,  if  ten  men  were 
available  in  a  particular  hour,  all  ten  would  be  assigned  to  work  simultaneously  on  the  same 
job.  Also,  if  the  available  resource  represents  several  machines,  then  this  formulation  permits 
each  job  to  be  processed  simultaneously  on  more  than  one  machine.  Equivalently,  this  means 
that  jobs  must  be  divisible  into  portions  that  can  be  allocated  equally  to  the  number  of 
machines  available.  Such  a  formulation  will  be  called  a  continuous-time  model. 

In  order  to  allow  for  a  wider  range  of  applicability,  we  can  re-formulate  the  model  in 
discrete  time  as  follows. 

(a)  Unit intervals on the time axis (of Figure 1) are called periods, and job completion
times are measured in periods.

(b)  In  a  given  period  the  resource  availability  is  an  integer  number  of  units. 

(c)  Each  job  requires  an  integer  number  of  resource-periods. 

(d)  Processing  work  is  divisible  only  to  the  level  of  one  resource-unit  for  one  period. 

Under this formulation, for example, the time unit might be days, the resource availability
might be crew size, and the processing requirement might be man-days. Property (d) then
restricts the refinement of a schedule to the assignment of each crew member's task on a
day-by-day basis. Furthermore, a task requiring two man-days could be accomplished either by one
crew member working two days or by two members working one day each.

In the discrete-time context, we may regard sequencing as ordering jobs on the resource
scale in Figure 1, but taking the completion time of job j to be $\lceil C_j \rceil$, the smallest
integer greater than or equal to $C_j$, where $C_j$ itself is given by (1). In other words, we obtain a
sequence using the continuous-time framework, which assumes arbitrarily divisible jobs, but we
round up the resulting completion times when they are noninteger. Under this interpretation
of the model, due-dates are specific days and a job is "on time" as long as it is completed on or
before the specific day. Clearly, in the discrete time model several jobs can have the same
completion time.

To verify that a job sequence can be interpreted consistently with requirement (d), note
that the cumulative resource requirement and the cumulative resource availability by the end of
any period are both integers. It follows that the workload implied by the continuous-time
solution can be shifted to meet the integer restrictions of the discrete-time model since the
resource availability in any period can be treated as a set of unit-resource availabilities. Then
any fraction of a day's work in the original solution can be rescheduled as a day's work for the
same proportion of the total resource units available. This rescheduling will consume an integer
number of resource-periods for each job.

As  an  example,  consider  the  three-job  problem  shown  below. 


j        1    2    3
$p_j$    7    3    6

r(t) = 1,   0 ≤ t ≤ 4
r(t) = 4,   4 < t ≤ 7

In Figure 2 we represent the sequence 1-2-3 assuming infinite divisibility. In Figure 3 we show
how the work is rescheduled to meet the integrality requirement of the discrete-time model.
As Figures 2 and 3 indicate, the discrete-time conditions can be incorporated by a minor
adjustment of continuous-time job assignments that essentially involves replacing vertical
portions of the schedule chart with horizontal portions whenever the available resource capacity
is split among two or more jobs within a period.
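
As a check on the three-job example, the mapping from the resource axis to the time axis can be sketched in Python. The sketch assumes a piecewise-constant profile stored as (segment length, rate) pairs; `time_for`, `completion_times` and `PROFILE` are hypothetical names introduced only for this illustration.

```python
import math
from typing import List, Sequence, Tuple

# Piecewise-constant resource profile: (segment length, rate), in time order.
Profile = List[Tuple[float, float]]

# r(t) = 1 on the first 4 time units, r(t) = 4 on the next 3.
PROFILE: Profile = [(4, 1), (3, 4)]


def time_for(profile: Profile, q: float) -> float:
    """Return t(Q): the smallest t at which cumulative availability R(t) = Q."""
    if q <= 0:
        return 0.0
    elapsed = 0.0
    remaining = q
    for length, rate in profile:
        capacity = length * rate
        if rate > 0 and remaining <= capacity:
            return elapsed + remaining / rate
        remaining -= capacity
        elapsed += length
    raise ValueError("profile does not supply enough resource")


def completion_times(profile: Profile, requirements: Sequence[float]) -> List[float]:
    """Continuous-time completion times C_j = t(p(B) + p_j) for a given sequence."""
    done = 0.0
    result = []
    for p in requirements:
        done += p
        result.append(time_for(profile, done))
    return result


continuous = completion_times(PROFILE, [7, 3, 6])   # sequence 1-2-3
discrete = [math.ceil(c) for c in continuous]       # round up to whole periods
print(continuous, discrete)
```

For the sequence 1-2-3 the continuous-time completion times are 4.75, 5.5 and 7, which round up to periods 5, 6 and 7 in the discrete-time model.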


Figure 2.


Our  purpose  in  this  paper  is  to  note  that  certain  well-known  results  for  the  single-machine 
model  carry  over  with  little  or  no  modification  to  the  variable-resource  model.  In  fact,  we 
found  only  one  exception.  (See  Section  3.) 


A variable resource problem has also been examined by Gelders and Kleindorfer [6,7] in
the context of coordinating aggregate and detailed scheduling decisions. In their model the
variation in resource availability results from the explicit decision to schedule overtime. This
decision leads to a cumulative resource availability function consisting of segments with
identical positive slope (corresponding to capacity available) separated by horizontal segments
(corresponding  to  unused  overtime.)  Their  objective  is  to  determine  when  and  how  much 
overtime  should  be  scheduled,  and  to  determine  the  associated  job  sequence,  so  as  to  minimize 
the  sum  of  overtime,  tardiness  and  flow-time  costs.  They  also  note  that  for  a  given  overtime 
schedule,  shortest-first  sequencing  minimizes  mean  job  completion  time  while  nondecreasing 
processing  time-to-weight  ratio  sequencing  may  not  minimize  mean  weighted  job  completion 
time.  These  two  results  are  encompassed  in  our  general  treatment  of  the  variable-resource 
model  in  Sections  2  and  3. 
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2.  RESULTS  THAT  GENERALIZE  TO  VARIABLE  RESOURCES 

The following is a set of sequencing results for the variable-resource model that are
identical to or slight modifications of their single-machine counterparts. It is not difficult to
establish that the results we give are valid for both the continuous-time and discrete-time
models. However, proofs are omitted, since they are typically direct extensions of the original
arguments in the single-machine case.

The results involve sequences of jobs, or at least partial sequences. We reiterate that
these sequences can be viewed as applying to the resource axis in Figure 1 but can be converted
to completion time schedules in either the continuous-time or discrete-time case by means of
the appropriate transformation. We use $C_j$ to denote the completion time of job j and t(p(B))
to denote the makespan for the jobs in B, recognizing that in the discrete-time case these
quantities must be interpreted in the appropriate way.

Minimizing  the  Maximum  Cost 

One of the few efficient algorithms for a broad class of sequencing criteria is Lawler's
procedure [9] for minimizing the maximum cost in the sequence. Formally, the criterion is to
minimize

$$G = \max_{j \in N} \{ g_j(C_j) \}$$

where $g_j(C_j)$ is the cost incurred by job j when it completes at $C_j$ and where $g_j(t)$ is
nondecreasing in t. The solution procedure works by constructing a sequence from the back of the
schedule and the procedure is easily adapted to the variable-resource model, as shown below.

1.  Initially let A = ∅. (A denotes the set of jobs at the end of the schedule and
A' = N − A denotes its complement.)

2.  Find M = t(p(A')). (M is the makespan for unscheduled jobs.)

3.  Identify job k satisfying $g_k(M) = \min_{j \in A'} \{ g_j(M) \}$. (Considering only the
unscheduled jobs, job k is the one that achieves the minimum cost when
scheduled last.)

4.  Schedule job k last among the jobs in A'. Then add job k to A and return to Step
2 until A = N.

A  noteworthy  special  case  is  the  criterion  of  maximum  tardiness.  The  procedure 
sequences  jobs  in  nondecreasing  order  of  due-dates  in  this  case.  Thus,  as  in  the  single  machine 
problem,  earliest  due-date  (EDD)  scheduling  will  minimize  the  maximum  tardiness.  It  will 
also  find  a  schedule  in  which  all  jobs  complete  on  time,  if  such  a  schedule  exists. 
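
The backward construction can be sketched as follows for a piecewise-constant resource profile. This is a minimal illustration under stated assumptions, not the published algorithm's code: the profile representation, the helper `time_for`, the function `lawler_sequence` and the due dates in the usage lines are all hypothetical, and ties in Step 3 are broken by lowest job number.

```python
from typing import Callable, Dict, List, Tuple

Profile = List[Tuple[float, float]]  # (segment length, rate), in time order


def time_for(profile: Profile, q: float) -> float:
    """Return t(Q): the smallest t at which cumulative availability R(t) = Q."""
    if q <= 0:
        return 0.0
    elapsed = 0.0
    remaining = q
    for length, rate in profile:
        capacity = length * rate
        if rate > 0 and remaining <= capacity:
            return elapsed + remaining / rate
        remaining -= capacity
        elapsed += length
    raise ValueError("profile does not supply enough resource")


def lawler_sequence(
    profile: Profile,
    p: Dict[int, float],
    cost: Dict[int, Callable[[float], float]],
) -> List[int]:
    """Build a min-max-cost sequence from the back of the schedule."""
    unscheduled = set(p)          # the set A' of the text
    tail: List[int] = []          # the set A, collected last-to-first
    while unscheduled:
        # Step 2: makespan of the unscheduled jobs on the time axis.
        makespan = time_for(profile, sum(p[j] for j in unscheduled))
        # Step 3: the job that is cheapest to finish at that time goes last.
        k = min(sorted(unscheduled), key=lambda j: cost[j](makespan))
        tail.append(k)
        unscheduled.remove(k)
    tail.reverse()
    return tail


# Three-job example with hypothetical due dates; cost = tardiness.
profile = [(4, 1), (3, 4)]
p = {1: 7, 2: 3, 3: 6}
due = {1: 7, 2: 5, 3: 6}
cost = {j: (lambda t, d=due[j]: max(t - d, 0.0)) for j in due}
print(lawler_sequence(profile, p, cost))
```

With tardiness as the cost function and these assumed due dates, the sketch returns the sequence 2-3-1, which is the earliest due-date order.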

Minimizing  the  Sum  of  Tardiness  Penalties 

Many problems of considerable interest for the single machine model may be regarded as
special cases of the problem of minimizing total tardiness penalty,

$$G = \sum_{j \in N} w_j T_j$$

where $T_j = \max(C_j - d_j, 0)$ and $w_j > 0$.

Several dominance properties, in the spirit of Emmons [3], can be shown to hold for the
variable resource problem. These in turn imply similar dominance properties for the various
special cases and, in some instances, optimizing (ranking) procedures. Let:

J = a set of jobs

J' = the complement of J

$A_i$ = the set of jobs known to follow job i, by virtue of precedence conditions.

$B_i$ = the set of jobs known to precede job i, by virtue of precedence conditions.

$C_J$ = the time required to process the jobs in set J, defined by $R(C_J) = \sum_{i \in J} p_i$

$B_j \cup \{j\}$ = the set containing job j and all jobs known to precede job j by virtue of
the precedence conditions.

$A'_i - \{j\}$ = the set containing the complement of $A_i$, but excluding job j.

THEOREM 1: If $w_k \le w_j$ and $d_k \ge \max(d_j, C_{A'_j - \{k\}})$ then j precedes k in an optimal
sequence.

THEOREM 2: If $d_k \ge C_{A'_j}$ then j precedes k in an optimal sequence.

THEOREM 3: If $p_j \le p_k$, $w_j \ge w_k$ and $d_j \le \max(d_k, C_{B_k \cup \{k\}})$, then j precedes k in an
optimal sequence.

COROLLARY (Theorem 3): If $w_j \ge w_k$, $p_j \le p_k$ and $d_j \le d_k$, then j precedes k in an
optimal sequence. The corollary immediately yields an optimal ranking procedure for problems
derived by making constant any two of the three parameters. For example, when
$G = \sum_{j \in N} w_j T_j$ with $w_j = w$ and $d_j = d$, an optimal sequence is determined by
ordering the jobs by processing


SEQUENCING  INDEPENDENT  JOBS 




requirement, smallest first ($p_1 \le p_2 \le \ldots \le p_n$). When d = 0 we have $T_j = C_j$, i.e., the
mean flowtime problem, for which this sequence is called shortest processing time (SPT).

The problem of minimizing the total tardiness penalty when $p_j = p$ is also not difficult to
solve. Constant resource requirements imply a fixed sequence of completion times under any
sequence. In particular the first job completes at t(p), the second job at t(2p), etc.; and an
optimal schedule may be found by assigning jobs to positions, as in Lawler [10]:

$x_{ij}$ = 1 if job i appears in sequence position j
         = 0 otherwise

$c_{ij}$ = the penalty for job i when it appears in sequence position j, i.e.,
$w_i \max[0, t(jp) - d_i]$.

The problem is to minimize $\sum_i \sum_j c_{ij} x_{ij}$

subject to

$$\sum_i x_{ij} = 1 \quad \text{for each position } j,$$

$$\sum_j x_{ij} = 1 \quad \text{for each job } i.$$

An assignment algorithm can produce the optimal solution.
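
A small Python sketch makes the position-assignment idea concrete. Note the substitution: for brevity it enumerates all permutations instead of running a true assignment algorithm, so it is only suitable for a handful of jobs. The instance (four jobs with p = 4, the due dates and the weights) and the names `time_for` and `best_assignment` are hypothetical, and the penalty is taken to be weighted tardiness, in line with the objective $G = \sum w_j T_j$.

```python
from itertools import permutations
from typing import List, Sequence, Tuple

Profile = List[Tuple[float, float]]  # (segment length, rate), in time order


def time_for(profile: Profile, q: float) -> float:
    """Return t(Q): the smallest t at which cumulative availability R(t) = Q."""
    if q <= 0:
        return 0.0
    elapsed = 0.0
    remaining = q
    for length, rate in profile:
        capacity = length * rate
        if rate > 0 and remaining <= capacity:
            return elapsed + remaining / rate
        remaining -= capacity
        elapsed += length
    raise ValueError("profile does not supply enough resource")


def best_assignment(
    profile: Profile, p: float, due: Sequence[float], weight: Sequence[float]
) -> Tuple[float, List[int]]:
    """Brute-force the job-to-position assignment for equal requirements p."""
    n = len(due)
    # Completion time of position j (0-based) is t((j + 1) * p) for any sequence.
    finish = [time_for(profile, (j + 1) * p) for j in range(n)]

    def cost(order: Sequence[int]) -> float:
        return sum(
            weight[i] * max(0.0, finish[pos] - due[i])
            for pos, i in enumerate(order)
        )

    best = min(permutations(range(n)), key=cost)
    return cost(best), list(best)


# Hypothetical instance: four jobs, each requiring 4 resource units.
total, order = best_assignment([(4, 1), (3, 4)], 4, [4, 5, 5, 6], [1, 2, 3, 1])
print(total, order)
```

Here the four positions complete at times 4, 5, 6 and 7 whatever the order, and the minimum total penalty for the assumed data is 3. For realistic problem sizes the enumeration would be replaced by an assignment (for example, Hungarian-type) algorithm.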

The most general version of the single-machine problem, with unequal due-dates,
processing times, and weights is binary NP-complete. The computational complexity of the cases
in which $w_j = w$ or $d_j = d > 0$ is an open question. However, pseudo-polynomial algorithms
have been developed by Lawler [11] and Lawler and Moore [12]. The algorithms which have
demonstrated the most effective computational power for the problems are those found in [14].
These and other enumerative algorithms can be modified in a straightforward manner to
accommodate the variable resource problem.

Minimizing  The  Weighted  Number  of  Tardy  Jobs 

In this case we are interested in whether a job is tardy rather than the length of time
by which it is tardy. Let $\delta(T_j) = 1$ indicate that job j is tardy and $\delta(T_j) = 0$ indicate
that it is completed on time. If each job has its own penalty for being tardy, i.e.,

$$G = \sum_{j \in N} w_j \delta(T_j),$$

then the single-machine problem is binary NP-complete, although it can be solved by a
pseudo-polynomial dynamic programming algorithm due to Lawler and Moore [12]. The
algorithm can easily be adapted to the variable-resource problem with no impact on computational
efficiency.

By  restricting  the  data  we  obtain  special  cases  that  are  solvable  by  ranking  algorithms,  just 
as  in  the  single-machine  case: 

THEOREM 4: When $d_j = d$ for all jobs, if the processing times and weights are agreeable
($p_i \le p_j$ whenever $w_i \ge w_j$) then an optimal sequence is obtained by scheduling the jobs in
order of processing requirement, shortest first (in order of weight, largest first).

COROLLARY (Theorem 4): When $d_j = d$ and $p_j = p$, an optimal sequence is obtained by
scheduling jobs in order of weight, largest first.

When $w_j = w$ for all jobs a sequence that minimizes the number of tardy jobs, i.e.,
$G = \sum_{j \in N} \delta(T_j)$, can be determined by generalizing an efficient algorithm due to
Moore [13].

Since maximum tardiness is minimized by sequencing the jobs in EDD order, it follows that if
sequence S yields minimum G, then so will sequence S', in which the on-time jobs in S are
scheduled in EDD order followed by all the tardy jobs in S. Letting $S_n$ represent the largest
possible set of on-time jobs (so that $G = n - |S_n|$ is the minimum number of tardy jobs) $S_n$
can be determined as follows:

1.  Order and index the jobs in N such that $d_1 \le d_2 \le \ldots \le d_n$ (where ties are
broken arbitrarily). Set $S_0 = \emptyset$ and k = 1.

2.  If k = n + 1 stop. $S_n$ is an optimal set.

3.  If $t\big(\sum_{i \in S_{k-1}} p_i + p_k\big) \le d_k$, set $S_k = S_{k-1} \cup \{k\}$; otherwise let
$p_r = \max\{p_j \mid j \in S_{k-1} \cup \{k\}\}$ and set $S_k = S_{k-1} \cup \{k\} - \{r\}$.

4.  Set k = k + 1 and return to step 2.
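
The four steps can be sketched in Python for the continuous-time model with a piecewise-constant profile. This is an illustrative sketch only: the helper `time_for`, the function `max_on_time_set` and the due dates in the usage lines are hypothetical, and for the discrete-time model the completion time would be rounded up before it is compared with the due date.

```python
from typing import Dict, List, Tuple

Profile = List[Tuple[float, float]]  # (segment length, rate), in time order


def time_for(profile: Profile, q: float) -> float:
    """Return t(Q): the smallest t at which cumulative availability R(t) = Q."""
    if q <= 0:
        return 0.0
    elapsed = 0.0
    remaining = q
    for length, rate in profile:
        capacity = length * rate
        if rate > 0 and remaining <= capacity:
            return elapsed + remaining / rate
        remaining -= capacity
        elapsed += length
    raise ValueError("profile does not supply enough resource")


def max_on_time_set(
    profile: Profile, p: Dict[int, float], d: Dict[int, float]
) -> List[int]:
    """Largest set of on-time jobs, returned in EDD order."""
    on_time: List[int] = []
    load = 0.0
    for k in sorted(p, key=lambda j: d[j]):      # Step 1: EDD order
        on_time.append(k)
        load += p[k]
        if time_for(profile, load) > d[k]:       # Step 3: job k would be tardy
            r = max(on_time, key=lambda j: p[j])  # drop the largest requirement
            on_time.remove(r)
            load -= p[r]
    return on_time


# Three-job example with hypothetical due dates.
profile = [(4, 1), (3, 4)]
p = {1: 7, 2: 3, 3: 6}
d = {1: 4, 2: 5, 3: 6}
print(max_on_time_set(profile, p, d))
```

With the assumed due dates, job 1 cannot finish by time 4 (it would complete at 4.75) and is dropped, while jobs 2 and 3 complete at 3 and 5.25 and are on time, so exactly one job is tardy.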

Constrained  (Secondary  Criterion)  Problems 


Several  authors  have  addressed  the  problem  of  sequencing  n  jobs  on  one  machine  so  as  to 
optimize  one  criterion  while  restricting  the  set  of  sequences  so  that  all  or  some  jobs  also  satisfy 
another.  We  include  four  such  problems  here.  In  particular. 


(a)  Minimize total (mean) flow time given that a subset $E$ of the jobs is to be on time
(Burns and Noble [2] and Emmons [4]), i.e.,

        $\min G = \sum_{i \in N} C_i$

        s.t.  $C_i \le d_i$,  $i \in E$


(b)  Minimize maximum tardiness given that a subset $E$ of the jobs is to be on time
(Burns and Noble [2]), i.e.,

        $\min G = \max_{i \in N} T_i$

        s.t.  $C_i \le d_i$,  $i \in E$


(c)  Minimize mean flow time over all sequences which yield minimum maximum cost
(Emmons [5] and Heck and Roberts [8]), i.e.,

        $\min G = \sum_{i \in N} C_i$

        s.t.  $g_i(C_i) \le G_m$,  $i \in N$

where $g_i(C)$ is a non-decreasing function of $C$ and $G_m = \min \{\max_{i \in N} g_i(C_i)\}$

(d)  Minimize the number of tardy jobs given that a subset $E$ of the jobs is to be on time
(Sidney [15]), i.e.,

        $\min G = \sum_{i \in N} \delta(T_i)$

        subject to  $C_i \le d_i$,  $i \in E$.


In all cases the algorithms originally developed for the single-machine problem can easily
be adapted to the variable-resource problem.

The first three problems can be solved by a one-pass algorithm which sequences jobs one
at a time from last to first. Suppose that jobs have been assigned to positions $k + 1$ through $n$.
Let $N_k$ be the set of jobs as yet unsequenced and $L_k$ be the subset of $N_k$ that can be assigned
position $k$ without violating the constraint. A job from $L_k$, say job $j$, is then chosen according
to a certain rule and sequenced in position $k$. Then $N_{k-1} = N_k - \{j\}$, $L_{k-1}$ is generated, and a
job is sequenced in position $k - 1$, etc.

Letting $E_k = N_k \cap E$ and $p(N_k) = \sum_{j \in N_k} p_j$, then for problems (a) and (b)

        $L_k = (N_k - E_k) \cup \{j \mid j \in E_k,\ d_j \ge t(p(N_k))\}$

while for problem (c)

        $L_k = \{j \mid j \in N_k,\ g_j(t(p(N_k))) \le G_m\}$

The rule for choosing the job for position $k$ in problems (a) and (c) is choose $j$ such that

        $p_j = \max_{i \in L_k} p_i$

while for problem (b), $j$ is chosen such that

        $d_j = \max_{i \in L_k} d_i$.
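
As an illustration, the backward pass for problem (a) might be sketched as follows. The function name `backward_schedule` and the argument `finish_time` (the mapping from total requirement to completion time, taken here to be the identity unless supplied) are assumptions made for this sketch; ties in the longest-job rule are broken by lowest job index.

```python
from typing import Callable, List, Sequence, Set


def backward_schedule(
    p: Sequence[float],
    d: Sequence[float],
    must_be_on_time: Set[int],
    finish_time: Callable[[float], float] = lambda q: q,
) -> List[int]:
    """Build a sequence from last position to first for problem (a)."""
    unsequenced = set(range(len(p)))
    sequence: List[int] = []
    while unsequenced:
        # Completion time of whichever job is placed in the current last position.
        t_k = finish_time(sum(p[j] for j in unsequenced))
        # L_k: jobs outside E, plus jobs in E whose due date is not violated.
        eligible = [
            j for j in sorted(unsequenced)
            if j not in must_be_on_time or d[j] >= t_k
        ]
        if not eligible:
            raise ValueError("no sequence keeps every job in E on time")
        j = max(eligible, key=lambda i: p[i])  # longest eligible job goes last
        sequence.append(j)
        unsequenced.remove(j)
    sequence.reverse()
    return sequence


if __name__ == "__main__":
    # Hypothetical data: job 0 must finish by time 3.
    print(backward_schedule([3, 1, 2], [3, 10, 10], {0}))
    print(backward_schedule([3, 1, 2], [3, 10, 10], set()))
```

With an empty constraint set the sketch reduces to shortest-first sequencing, as expected for unconstrained mean flow time.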

Problem (d) may be solved by modifying the due-dates to reflect the fact that if
$d_i < d_k$, $k \in E$, and job $i$ is to be on time in a feasible sequence then $i$ must be completed by
$t(R(d_k) - p_k)$. Then Moore's algorithm can be applied, with an adjustment to assure that jobs
in $E$ will be on time. This is essentially the procedure developed by Sidney [15].

Nonsimultaneous  Arrivals 

In the preceding sections all jobs are assumed to be available for sequencing at time zero.
We now consider problems in which job $j$ is not available for processing until the beginning of
period $r_j$, where $r_j \ge 1$. If, in this situation, it is possible to interrupt the processing of a job
and resume it later without loss of progress toward completion of the job, we say that the system
operates in a "preempt-resume" mode.

For single-machine problems with criteria maximum tardiness ($G = \max T_j$) or total
(mean) completion time ($G = \sum_{j \in N} C_j$) when preempt-resume applies, the static optimizing rules
EDD and SPT can be generalized in a straightforward manner to produce optimal sequences
when all jobs are not simultaneously available ([1], p. 82). The same generalizations apply when
resource availability varies with time, using the following procedure:

1.  At time zero if one or more jobs are available assign the resource to process the
available  job  with  the  smallest  (most  urgent)  priority.  Otherwise  leave  the 
resource  idle  until  the  first  job  is  available. 

2.  At  each  job  arrival,  compare  the  priority  of  the  newly  available  job  j  with  the 
priority  of  the  job  currently  being  processed.  If  the  priority  of  job  j  is  less,  allow 
job j to preempt the job being processed; otherwise add job j to the list of available
jobs.



3.  At  each  job  completion,  examine  the  set  of  available  jobs  and  assign  the  resource 
to  process  the  one  with  the  smallest  priority. 

In  order  to  minimize  maximum  tardiness,  the  priority  of  a  job  is  taken  to  be  its  due-date,  and 
to  minimize  mean  flowtime  the  priority  is  its  remaining  resource  requirement. 
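
The three-step procedure can be illustrated with a small discrete-period simulation. Everything named here is an assumption for the sake of the sketch: `preempt_resume`, its arguments, and the convention that resource left over in a period after a job completes passes to the next most urgent available job. The priority is supplied as a function, so the same loop covers the due-date rule and the remaining-requirement rule.

```python
from typing import Callable, List, Optional, Sequence

TOL = 1e-12  # tolerance for treating a remaining requirement as zero


def preempt_resume(
    release: Sequence[int],
    p: Sequence[float],
    resource: Sequence[float],
    priority: Callable[[int, List[float]], float],
) -> List[Optional[int]]:
    """Return the completion period of each job (None if never completed).

    release[j] is the first period in which job j is available, resource[t-1]
    is the amount of resource offered in period t, and priority(j, remaining)
    returns the urgency of job j (smaller means more urgent).
    """
    remaining = [float(x) for x in p]
    completion: List[Optional[int]] = [None] * len(p)
    for t, avail in enumerate(resource, start=1):
        while avail > TOL:
            ready = [
                j for j in range(len(p))
                if release[j] <= t and remaining[j] > TOL
            ]
            if not ready:
                break  # resource idles for the rest of the period
            j = min(ready, key=lambda i: priority(i, remaining))
            used = min(avail, remaining[j])
            remaining[j] -= used
            avail -= used
            if remaining[j] <= TOL:
                completion[j] = t
    return completion


if __name__ == "__main__":
    due = [9, 8, 4]  # hypothetical due dates
    args = ([1, 2, 2], [3, 2, 4], [1, 2, 2, 2, 2])
    print(preempt_resume(*args, priority=lambda j, rem: due[j]))  # due-date rule
    print(preempt_resume(*args, priority=lambda j, rem: rem[j]))  # remaining requirement
```

Because arrivals occur only at the beginning of a period, re-selecting the most urgent job at each period boundary plays the role of the preemption test in step 2.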

3.  MINIMIZING  THE  SUM  OF  WEIGHTED  COMPLETION  TIMES 

One case for which the single-machine result does not generalize in a straightforward
manner to the corresponding variable-resource problem is the case of sequencing to minimize the
sum of weighted completion times, where

        $G = \sum_{j \in N} w_j C_j$

when all jobs are available at time zero.

Sequencing jobs in nondecreasing order of the ratio $p_j/w_j$, which will always minimize $G$
in the single-machine problem, need not yield an optimal sequence when the resource availability
varies with time. The following simple example demonstrates this fact.

EXAMPLE

        $j$             1      2      3
        $p_j$           7      3      6
        $w_j$           5      2      4
        $p_j/w_j$      1.4    1.5    1.5

        $r(t) = 1$,  $0 \le t \le 4$
        $r(t) = 4$,  $4 < t \le 7$          $M = 7$

Sequencing the jobs in nondecreasing order of $p_j/w_j$ yields the order 1-2-3, for which the
completion times are 4.75, 5.5 and 7. Therefore, $G = 62.75$. For the sequence 2-1-3 the completion
times are 3, 5.5 and 7, with $G = 61.5$. (Under the discrete time framework $G = 65$ for
1-2-3 but $G = 64$ for 2-1-3.)
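
The figures in this example can be checked with a short calculation. The sketch below assumes a piecewise-constant resource profile given as `(duration, rate)` segments and assumes that, in the discrete framework, a job completes at the end of the period in which its requirement is met (that is, the continuous completion time rounded up). The names `finish_time` and `weighted_completion` are introduced here for illustration only.

```python
import math
from typing import Dict, List, Sequence, Tuple

Profile = List[Tuple[float, float]]  # (duration, rate) segments in time order


def finish_time(q: float, profile: Profile) -> float:
    """Earliest time at which cumulative resource consumption reaches q."""
    t = 0.0
    for duration, rate in profile:
        capacity = duration * rate
        if q <= capacity:
            return t + q / rate
        q -= capacity
        t += duration
    raise ValueError("requirement exceeds the total resource available")


def weighted_completion(
    seq: Sequence[int],
    p: Dict[int, float],
    w: Dict[int, float],
    profile: Profile,
    discrete: bool = False,
) -> float:
    """Sum of weighted completion times for jobs processed in the order seq."""
    total, load = 0.0, 0.0
    for j in seq:
        load += p[j]
        c = finish_time(load, profile)
        if discrete:
            c = math.ceil(c)  # assumed discrete-time convention
        total += w[j] * c
    return total


if __name__ == "__main__":
    p = {1: 7, 2: 3, 3: 6}
    w = {1: 5, 2: 2, 3: 4}
    profile = [(4, 1), (3, 4)]  # r(t) = 1 on [0, 4], r(t) = 4 on (4, 7]
    for seq in ([1, 2, 3], [2, 1, 3]):
        print(seq,
              weighted_completion(seq, p, w, profile),
              weighted_completion(seq, p, w, profile, discrete=True))
```

Enumerating all six permutations with the same function is a simple way to confirm which sequence is best for a given profile.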

While the differences in $G$-values may seem almost insignificant, it is possible to construct
an example in which sequencing by increasing ratios $p_j/w_j$ will yield an arbitrarily bad solution.
Consider the data for a two-job problem in which


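        $j$             1            2
        $p_j$          $10^m$       $5 \times 10^{2m}$
        $w_j$           1           $10^m$
        $p_j/w_j$      $10^m$       $5 \times 10^m$

        $r(t) = \ldots$,  $0 \le t \le 1$
        $r(t) = \ldots$,  $1 < t \le 1 + 10^m$

For the ratio-rule sequence $S$ = 1-2 and the alternative sequence $S'$ = 2-1 we have

        $G(S)/G(S') \approx \tfrac{1}{2}(10^m)$.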
For the special case in which the processing times and weights are agreeable ($p_i \le p_j$
whenever $w_i \ge w_j$) sequencing by nondecreasing ratios of $p_j/w_j$ does produce an optimal
solution (see Theorem 4). Otherwise the two examples given in this section reinforce the notion
that  the  single-machine  result  cannot  be  extended  to  even  the  simplest  versions  of  the 
variable-resource  model.  In  one  example  the  resource  profile  r(t)  is  nondecreasing,  while  in 
the  other  example  r(t)  is  nonincreasing.  In  both  cases  there  is  only  one  change  in  r(t).  These 
situations  would  appear  to  be  among  the  least  drastic  ways  of  relaxing  the  constant  resource 
assumption; but, as we have demonstrated, the ratio rule still fails. At this point, we can
conclude only that the minimization of $\sum w_j C_j$ involves more than a simple extension of the
single-machine result. Obviously, any optimal ordering rule (if one exists) would have to
involve information about the resource profile as well as information about processing
requirements and weights. We conjecture that this problem is NP-complete.

4.  COMMENTS 

Although it is not possible to extend all single-machine results directly to the
variable-resource case, a few observations can be made. A look at Figure 1 indicates that the graph of
$R(t)$ transforms processing times (on the horizontal axis) into resource consumptions (on the
vertical  axis),  and  vice-versa.  This  transformation  is  at  least  order-preserving.  In  particular, 
the  makespan  for  a  set  A  of  jobs  is  at  least  as  large  as  the  makespan  for  set  B  when  the  jobs  in 
A  have  a  total  processing  requirement  that  equals  or  exceeds  the  requirement  of  the  jobs  in  B. 
This  property  is  fundamental  to  the  proof  of  many  single-machine  results  as  they  carry  over  to 
variable-resource models. Moreover, the results for problems in which $p_j = p$ do not rely on
the  precise  nature  of  the  transformation,  but  depend  only  on  the  fact  that  all  solutions  share  a 
common  nondecreasing  sequence  of  completion  times. 

In the single-machine case, $R(t)$ is linear, implying that the mapping of resource
consumptions into processing times is proportionality-preserving as well as order-preserving. That
is, ratios of intervals on the resource axis convert to identical ratios on the time axis. This
property is not maintained in the variable-resource model, because the transformation distorts
proportionality. In particular, we have in the single-machine problem that $p_i/p_j \le w_i/w_j$ implies
$\Delta C_j/\Delta C_i \le w_i/w_j$, where $\Delta C_i$ and $\Delta C_j$ denote the magnitude changes in the completion times
of adjacent jobs $i$ and $j$ which are interchanged in sequence. This implication does not hold in
the variable-resource problem, so the pairwise interchange argument may fail.
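
As an illustration, the three-job example of Section 3 can be used to check the interchange condition numerically. The sketch below assumes that $t(q)$ denotes the time at which cumulative resource consumption reaches $q$ under that example's profile ($r(t) = 1$ on $[0, 4]$ and $r(t) = 4$ thereafter), and it compares jobs 1 and 2 in the first two positions.

```latex
% Data: p_1 = 7, w_1 = 5, p_2 = 3, w_2 = 2, so p_1/p_2 = 2.33 <= w_1/w_2 = 2.5.
\begin{align*}
t(3) &= 3, \qquad t(7) = 4 + \tfrac{3}{4} = 4.75, \qquad t(10) = 4 + \tfrac{6}{4} = 5.5,\\
\Delta C_1 &= t(10) - t(7) = 0.75, \qquad \Delta C_2 = t(10) - t(3) = 2.5,\\
\frac{\Delta C_2}{\Delta C_1} &= 3.33 > \frac{w_1}{w_2} = 2.5,
\qquad\text{so}\qquad w_2\,\Delta C_2 = 5 > w_1\,\Delta C_1 = 3.75 .
\end{align*}
```

Moving job 2 ahead of job 1 therefore gains 5 and loses 3.75, a net reduction of 1.25, which agrees with the difference between the values 62.75 and 61.5 reported for the sequences 1-2-3 and 2-1-3.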

These  observations  lead  to  the  conclusion  that  single-machine  results  involving  minimum 
weighted  sum  of  completion  times  cannot  be  directly  extended.  An  open  question  is  therefore 
how  to  exploit  the  structure  of  this  problem  in  the  variable-resource  case  in  order  to  find 
optimal  solutions. 
ABSTRACT

A model for assessing the effectiveness of alternative force structures in an
uncertain future conflict is presented and exemplified. The methodology is appropriate
to forces (e.g., the attack submarine force) where alternative unit
types may be employed, albeit at differing effectiveness, in the same set of missions.
Procurement trade-offs, and in particular the desirability of special purpose
units in place of some (presumably more expensive) general purpose
units, can be addressed by this model. Example calculations indicate an increase
in the effectiveness of a force composed of general purpose units, relative
to various mixed forces, with increase in the uncertainty regarding future
conflicts.

INTRODUCTION 

In  planning  the  procurement  of  major  weapons  systems  (submarines,  aircraft,  ships,  etc.), 
an  argument,  based  upon  relative  cost-effectiveness  in  certain  uses,  may  be  made  for  the 
development  and  purchase  of  some  items  which  are  less  versatile  and  effective  than  the  "best" 
available  components  of  an  overall  force.  Assuming  all  relative  costs  and  effectivenesses 
known, such an argument is sound at least to the extent that the uses necessitated by a potential
conflict are anticipated. However, under uncertainty about the nature of potential conflicts,
a  question,  in  general  more  subtle,  is  raised  regarding  the  optimal  composition  of  forces.  In 
this  case,  a  model  is  developed  here  to  analyze  the  utility  of  "mixed"  force  structures,  and 
examples  are  given  to  support  the  intuitive  notion  that  the  less  specific  are  the  presumptions 
about  needs  in  a  future  conflict,  the  more  valuable  are  the  most  versatile  forces. 

Our  focus  here  is  upon  presenting  a  model  able  to  capture  the  value,  under  uncertainty, 
of  versatile  forces  and  not  upon  the  equally  important  problem  of  determination  of  cost  and 
effectiveness  parameters.  The  latter,  as  well  as  the  mixture  versus  force  level  interaction,  are 
touched  upon  tangentially  in  an  example.  The  parameter  estimation  problem,  in  general, 
requires both large scale theoretical and empirical effort and has been addressed, in the
submarine case, in Reference [1].
By general purpose forces we shall mean the most versatile, advanced or effective
components which technology would currently allow in building a military force structure. Special
purpose  forces,  on  the  other  hand,  might  be  competitive  in  effectiveness  with  general  purpose, 
but  only  in  some  of  the  uses  (which  we  shall  call  missions)  which  possible  conflicts  might 
require.  Naturally,  we  presume  that  the  general  purpose  are  more  expensive  than  the  special 
purpose  forces  per  item,  and  further  that  the  special  purpose  forces  are  cost  effective,  in  some 
missions.  It  is  assumed  also  that  all  costs  are  accounted  for,  e.g.,  development,  production, 
maintenance,  operation,  repair  and  logistical  mobility,  etc. 

Examples  of  general  versus  special  purpose  forces  include  the  following.  In  the  case  of 
submarine  forces,  the  general  purpose  would  be  the  newest  fully  equipped  nuclear  submarine 
while  a  special  purpose  alternative  would  be  the  conventional  diesel  submarine  found  in  many 
European  navies.  The  former  is  presumed  at  least  as  effective  in  all  missions  (much  more  so  in 
some)  while  the  latter  is  much  less  expensive  and  nearly  as  effective  in  some  missions  requiring 
only low mobility. In the case of aircraft, a long-range fighter-bomber might be considered
general purpose and a plane designed primarily for ground attack would be special purpose.

The  force  planner  must  procure  some  mixture  of  forces,  constrained,  presumably,  by  a 
fixed  budget.  In  general  there  may  be  several  force  types,  ranging  from  the  very  general  to  the 
very  special  purpose,  and  we  may  think  of  the  force  structure  as  being  a  vector  of  inventories 
of  each  type  purchased.  We  think  of  a  conflict  as  simply  a  collection  of  mission  opportunities, 
and  the  planner’s  problem  is  then  to  procure  that  force  structure  which  permits  the  most 
effective deployment for a conflict. For a specified conflict, this poses a deterministic
optimization problem which, if the conflict includes enough important mission opportunities in which
the special purpose forces are cost effective, will surely suggest a mixed force structure
including at least some special purpose units.

However,  procurement  of  weapons  systems  must  generally  be  decided  upon  long  in 
advance of potential conflicts. For a variety of additional reasons, there will likely be
considerable uncertainty as to the precise nature of an actual conflict. We consider this uncertainty to
be characterized by a (known) distribution of potential conflicts, i.e., a distribution of mission
opportunities. We note that there are other ways in which uncertainty might be treated. For
example,  if  one's  own  force  structure  is  known,  a  hostile  adversary  might  be  expected,  to  the 
extent  that  circumstances  allow,  to  bias  a  conflict  in  a  direction  which  would  render  one's  own 
force  least  effective.  This  suggests  a  game  theoretic  approach.  Although  it  is  not  pursued 
further  here  and  although  its  information  requirements  might  be  great,  this  would  naturally  fit 
into  the  model  context  we  outline  below.  It  seems  likely  that  such  a  treatment  would  value  the 
versatility  of  general  purpose  forces  more  so  than  the  one  we  pursue.  Another  alternative 
would be to treat the effectiveness of each unit type as unknown and characterize it by a
probability distribution.

The  planner's  problem  which  we  address  is  then  to  choose  that  affordable  mixture  of 
forces  which,  assuming  optimal  deployment  in  any  conflict,  yields  the  largest  expected 
effectiveness  in  the  uncertain  conflict.  It  should  be  noted  that,  as  stated,  there  is  an  implicit 
assumption  that  the  planner  is  willing  to  take  the  risk  that  the  solution  mixture  will  produce 
unusually  low  effectiveness  in  some  conflicts.  (This  is  in  contrast  with  the  game  theoretic 
approach mentioned above.) However, to the extent that the planner is risk-averse rather
than risk-neutral, other criteria may be substituted for "expected effectiveness" without
conceptual difficulty and probably without operational difficulty in the development below. It should
also  be  mentioned  that  a  measure  of  the  value  of  the  versatility  of  general  purpose  forces  under 
uncertainty  lies  in  comparing  the  solution  mixture  of  the  above  problem  to  the  optimal  mixture 
when  the  expected  conflict  is  assumed  known  (i.e.,  the  case  of  certainty).  In  general  the 
"expected  effectiveness"  solution  will  differ  from  the  "expected  conflict"  solution. 
MODEL DESCRIPTION

We imagine $n$ force types $T_j$, $j = 1, \ldots, n$, and $m$ different mission categories $U_i$,
$i = 1, \ldots, m$, in which a component of the force might be engaged. Each $T_j$ is more or less
effective in a given $U_i$ which, to the extent that total effectiveness is linear in the deployment
of force types to mission categories, suggests the definition of an m-by-n unit effectiveness matrix

        $E = (e_{ij})$,

in which $e_{ij}$ indicates the effectiveness of a unit of $T_j$ employed in $U_i$ for a unit of conflict
(presuming opportunities available). We denote by a 1-by-n vector $s$ a particular force composition
in which $s_j$ is the number of $T_j$ available. At the time of a conflict, $s$ is fixed and, therefore,
provides a constraint on the total effectiveness attainable. A particular conflict is characterized
by the total opportunity for effectiveness which may be obtained from each mission category.
These bounds are summarized in an m-by-1 vector $b$ in which $b_i$ is the maximum opportunity
in $U_i$. This bound is expressed in effectiveness units rather than force units because the
"opportunities" are opportunities to damage the opponent and the force types vary in their ability
to do so in a given mission.

The m-by-n matrix $A = (a_{ij})$ summarizes the allocation (or deployment) of $T_j$ to $U_i$, i.e.,
$a_{ij}$ is the amount of force type $T_j$ allocated to mission category $U_i$ during a conflict. The $a_{ij}$ are
necessarily nonnegative but we do not assume them integral because of the possibility of
switching units among missions.

The problem of waging a given conflict is then to deploy the given force so as to maximize
total effectiveness within the constraint of the opportunities the conflict presents. In general
(no linearity assumption), total effectiveness is some function

        $e = e(A)$

of the allocation, and, furthermore,

        $e(A) = e_1(A) + \ldots + e_m(A)$,

where $e_i(A)$ is the effectiveness $A$ yields through the $i$th mission category. This means that
waging the known conflict $b$ amounts to the optimization problem:

        maximize    $e(A)$

        subject to  $\sum_{i=1}^{m} a_{ij} \le s_j$,
                    $e_i(A) \le b_i$,
                    $a_{ij} \ge 0$.
In case total effectiveness is linear in $A$, we have the linear programming problem:

        maximize    $\sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} e_{ij}$

        subject to  $\sum_{i=1}^{m} a_{ij} \le s_j$,           $j = 1, \ldots, n$

                    $\sum_{j=1}^{n} a_{ij} e_{ij} \le b_i$,    $i = 1, \ldots, m$

                    $a_{ij} \ge 0$,                            $j = 1, \ldots, n$;  $i = 1, \ldots, m$
In either case we denote the maximum achieved by $M(s,b)$. Then, equicost force compositions
$s$ may be compared, for a given conflict, by comparing the $M(s,b)$. A good general reference
for relevant concepts in the linear case is Reference [2].

Uncertainty as to the nature of the conflict is characterized by a probability distribution for
$b$. For a given $s$, there is an $M(s,b)$ for each possible value of $b$. These may then be averaged
according to the distribution of $b$ to obtain the expected value:

        $M(s) = E_b(M(s,b))$.

Comparisons among force compositions may then be made by comparing the $M(s)$, and the
planner's problem is to

        maximize  $M(s)$

subject to his budget constraint governing the possible forces $s$ which may be purchased. In
general,

        $\max_s E_b(M(s,b)) \ne \max_s M(s, E_b(b))$,

and in the case that effectiveness is linear in $A$,

        $\max_s E_b(M(s,b)) \le \max_s M(s, E_b(b))$.

Thus, the maximum expected effectiveness problem will in general have a different solution from
the problem of maximum effectiveness in an expected conflict, so that uncertainty makes a difference in
planning. We present examples which illustrate this, and in which the latter favors special purpose
forces while the former favors general purpose, presumably because of their greater ability
to defend against variation (uncertainty). The suggestion is that the more uncertainty there is,
the greater the value of general purpose forces.
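
The inequality for the linear case can be traced to concavity of the linear-programming value in the right-hand side $b$. The following is a brief sketch under that reading, not a derivation taken from the text; $\lambda$, $b'$, $A$ and $A'$ are auxiliary symbols introduced here.

```latex
% If A is feasible for (s, b) and A' is feasible for (s, b'), then
% \lambda A + (1-\lambda) A' is feasible for (s, \lambda b + (1-\lambda) b'),
% and the objective is linear in A. Hence, for 0 \le \lambda \le 1,
\begin{align*}
M\bigl(s, \lambda b + (1-\lambda) b'\bigr) &\ge \lambda M(s,b) + (1-\lambda) M(s,b'),\\
E_b\bigl(M(s,b)\bigr) &\le M\bigl(s, E_b(b)\bigr) \quad\text{for each } s \text{ (Jensen's inequality)},\\
\max_s E_b\bigl(M(s,b)\bigr) &\le \max_s M\bigl(s, E_b(b)\bigr).
\end{align*}
```

The gap between the two sides is what a planner loses by designing the force for the expected conflict alone.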

EXAMPLES 

We  conclude  by  giving  two  examples.  The  first  is  primarily  to  illustrate  the  evaluation 
model  and  some  of  the  remarks  made.  The  second  includes  a  more  thorough  examination  of 
the  model  and  its  assumptions  in  a  detailed  example  intended  to  be  suggestive  of  a  realistic  case 
which  motivated  this  study. 


EXAMPLE 1: Here we imagine three force types. Type $T_1$ is the general purpose, and $T_2$
and $T_3$ are different special purpose forces. There are also three mission categories. Type $T_2$ is
cost effective relative to $T_1$ in mission $U_1$, while $T_3$ is cost effective relative to $T_1$ in $U_3$. Total
effectiveness is assumed linear in allocations and the unit effectiveness matrix is

        $E = \begin{pmatrix} 1 & .7 & .1 \\ 1 & .1 & .1 \\ 1 & .1 & .7 \end{pmatrix}$


We consider seven equicost force compositions

        $s^1 = (9, 0, 0)$
        $s^2 = (6, 3, 3)$
        $s^3 = (6, 6, 0)$
        $s^4 = (6, 0, 6)$
        $s^5 = (5, 4, 4)$
        $s^6 = (5, 8, 0)$
        $s^7 = (5, 0, 8)$.


Thus, the two special purpose forces cost half as much as the general purpose over the range of
procurement considered. (Actually, the outcome will not differ qualitatively if more alternatives
based upon the 2-for-1 trade-off are considered.)


There are six possible conflicts

        $b^1 = (0, 6, 6)^T$,   $b^2 = (6, 6, 0)^T$,   $b^3 = (6, 0, 6)^T$,
        $b^4 = (12, 0, 0)^T$,  $b^5 = (0, 12, 0)^T$  and  $b^6 = (0, 0, 12)^T$

with the first three presumed to have probability 2/9 each and the last three probability 1/9
each. Thus, the expected conflict is

        $\bar{b} = (4, 4, 4)^T$.

Straightforward calculations then yield

        $M(s^1) = 9$,
        $M(s^2) = M(s^3) = M(s^4) = 8.6$, and
        $M(s^5) = M(s^6) = M(s^7) = 8.47$

so that

        $\max_i M(s^i) = 9$

is achieved at $s^1$, the all general purpose force. On the other hand,

        $M(s^1, \bar{b}) = 9$

while

        $M(s^2, \bar{b}) = 10.2$,  $M(s^3, \bar{b}) = M(s^4, \bar{b}) = 10.1$,
        $M(s^5, \bar{b}) = 10.6$,  and  $M(s^6, \bar{b}) = M(s^7, \bar{b}) = 9.3$.

Thus, a mixed force $s^5$ is optimal for the expected conflict. The conclusion, in this case, is that
general purpose forces are overall more cost efficient under uncertainty. It should be noted that
in calculating the $M(s^i)$, each other force had higher effectiveness than $s^1$ for some conflicts
(but not overall) and all were better than $s^1$ in the expected conflict. Thus, it is only the value
of versatility under uncertainty which makes $s^1$ preferred.
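
Several of the figures in Example 1 can be recomputed directly from the linear program. The sketch below is illustrative only: it uses a small dense simplex routine with Bland's rule (adequate for a problem of this size, and not the authors' computational method), it reads the rows of the matrix `E` as missions and the columns as force types, and the names `simplex_max`, `M` and `expected_M` are introduced here.

```python
from typing import List, Sequence


def simplex_max(c: Sequence[float], A: Sequence[Sequence[float]],
                b: Sequence[float], eps: float = 1e-9) -> float:
    """Maximize c.x subject to A x <= b, x >= 0, assuming every b[i] >= 0."""
    m, n = len(A), len(c)
    # Tableau rows [A | I | b]; objective row holds negated reduced costs.
    T = [[float(v) for v in A[i]]
         + [1.0 if k == i else 0.0 for k in range(m)]
         + [float(b[i])] for i in range(m)]
    z = [-float(v) for v in c] + [0.0] * (m + 1)
    basis = [n + i for i in range(m)]
    while True:
        # Bland's rule: lowest-index column that can improve the objective.
        col = next((j for j in range(n + m) if z[j] < -eps), None)
        if col is None:
            return z[-1]
        row, best = None, 0.0
        for i in range(m):
            if T[i][col] > eps:
                ratio = T[i][-1] / T[i][col]
                if (row is None or ratio < best - eps
                        or (abs(ratio - best) <= eps and basis[i] < basis[row])):
                    row, best = i, ratio
        if row is None:
            raise ValueError("linear program is unbounded")
        piv = T[row][col]
        T[row] = [v / piv for v in T[row]]
        for i in range(m):
            if i != row and abs(T[i][col]) > eps:
                f = T[i][col]
                T[i] = [a - f * v for a, v in zip(T[i], T[row])]
        f = z[col]
        z = [a - f * v for a, v in zip(z, T[row])]
        basis[row] = col


E = [[1.0, 0.7, 0.1], [1.0, 0.1, 0.1], [1.0, 0.1, 0.7]]  # missions x force types
CONFLICTS = [(0, 6, 6), (6, 6, 0), (6, 0, 6), (12, 0, 0), (0, 12, 0), (0, 0, 12)]
PROBS = [2 / 9, 2 / 9, 2 / 9, 1 / 9, 1 / 9, 1 / 9]


def M(s: Sequence[float], b: Sequence[float]) -> float:
    """Maximum total effectiveness of force s in conflict b (linear case)."""
    m, n = len(E), len(E[0])
    c = [E[i][j] for i in range(m) for j in range(n)]  # variable a_ij at i*n + j
    A: List[List[float]] = []
    rhs: List[float] = []
    for j in range(n):  # inventory: sum_i a_ij <= s_j
        A.append([1.0 if k % n == j else 0.0 for k in range(m * n)])
        rhs.append(s[j])
    for i in range(m):  # opportunity: sum_j a_ij e_ij <= b_i
        A.append([E[i][k % n] if k // n == i else 0.0 for k in range(m * n)])
        rhs.append(b[i])
    return simplex_max(c, A, rhs)


def expected_M(s: Sequence[float]) -> float:
    """Expected effectiveness of s over the six conflicts."""
    return sum(q * M(s, b) for q, b in zip(PROBS, CONFLICTS))


if __name__ == "__main__":
    for s in [(9, 0, 0), (6, 3, 3), (5, 4, 4)]:
        print(s, round(expected_M(s), 2), round(M(s, (4, 4, 4)), 2))
```

Any general-purpose linear programming code could replace `simplex_max`; the point of the sketch is only the construction of the constraint rows from $s$, $b$ and $E$.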

EXAMPLE  2:  This  example  is  taken  from  the  problem  of  submarine  procurement  and 
again  illustrates  the  effect  of  uncertainty  on  the  attractiveness  of  special  purpose  forces. 

For  simplicity,  we  consider  only  two  types  of  forces,  general  purpose  and  special  purpose 
units. In this setting, the distinction between new procurement general purpose or special
purpose forces might well be that between nuclear or diesel-electric propulsion. Equipment and
weapons could be identical, but the lower underwater mobility inherent in diesel-electric
propulsion would limit effective employment of such forces to particular ASW missions. In the
actual  planning  process,  the  existing  force  structure  must  also  be  considered  since  in  a  future 
conflict,  presently  existing  units  might  be  restricted  to  low  vulnerability  missions  (presumably 
being  less  capable  than  new  procurement  general  purpose  units)  and  thus  constitute  additional 
categories  of  special  purpose  forces. 

The present example considers four missions and measures unit effectiveness in each
mission by a kill rate defined by:

        $e_{ij} = \dfrac{\text{Kills (of enemy submarines) per unit time by one on-station U.S. submarine of type } T_j \text{ engaged in mission } U_i}{\text{Number of surviving enemy submarines}}$.

The above quantity is well defined for important submarine missions, being independent of
enemy force size and the number of U.S. submarines committed to $U_i$ over a substantial range
of  values.  For  instance,  considering  a  fixed  barrier  mission,  the  rate  of  enemy  transits  through 
the  barrier  and  thus  the  rate  of  opportunities  for  kill  would  be  proportional  to  the  number  of 
surviving  units.  Also,  U.S.  submarine  probabilities  of  detection  and  kill  given  an  opportunity 
(here  target  passage  through  the  barrier  area  assigned  to  the  submarine)  are,  at  least  initially, 
inversely proportional to the width of the barrier area assigned. In this circumstance, $e_{ij}$ is well
defined. Of course, nonlinear effects are present and become significant as the number of U.S.
units is increased. One could argue that, as returns diminish, no additional submarines should
be assigned to the fixed barrier; this then determines the mission opportunities, $b_i$. With units
of differing capabilities, $b_i$ is properly stated in terms of effectiveness obtained, not in some
fixed maximum number of units employed, since the onset of diminishing returns would occur
at different force levels for different unit effectivenesses. Finally, variations in $b_i$ (for the fixed
barrier mission) might arise from uncertainties in enemy basing, at-sea replenishment of
submarines, desirable barrier locations being untenable due to enemy ASW, etc.

Similar  arguments  apply  for  the  direct  support  mission  (submarines  employed  in  the 
defense  of  surface  formations)  and  similar  conclusions  are  obtained  in  the  area  search  mission. 

It should be noted that kill rates add, and that the summation

        $\sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} e_{ij}$,

being an overall rate at which enemy submarines are being killed, is a sensible measure of
effectiveness for the entire U.S. submarine force. It is even plausible that the differing submarine
types would be assigned to missions so as to (approximately) maximize this sum. Finally,
to the extent that variations in $b_i$ reflect week-to-week changes within a single conflict (i.e., one
week large numbers of forces are required for direct support, the next week these same units
are used in a barrier) rather than uncertainty as to some long-term mix of missions that will be
required in an unspecified conflict, then the expected value

        $E_b(M(s,b))$

can be interpreted as a time-average of force kill rate and again this is a preeminently sensible
measure.

It  is  the  authors'  belief  that  the  use  of  kill  rates  as  measures  of  unit  effectiveness  and  the 
linear  formulation  of  force  effectiveness,  while  necessarily  involving  some  approximation,  does 
capture  the  important  aspects  of  evaluating  alternative  submarine  force  structures.  Of  course, 
in  realistic  applications,  the  evaluation  of  effectiveness  for  alternative  forces  is  a  substantial 
effort.  Reference  [1]  documents  a  major  study  effort  which  arrives  at  such  estimates,  although 
not expressed as kill rates. Evaluation of force effectiveness is not addressed here. Quantitative
inputs to this second example, shown in the following tabulation, are completely hypothetical
and, while of reasonable relative magnitudes, are chosen to illustrate the theses of this
paper.
TABLE 1.

                    Unit Effectiveness, $e_{ij}$ (Kill rates)
                 General Purpose    Special Purpose    Expected Total Opportunity
                   Submarines         Submarines           for Effectiveness
    Mission 1         1.0                .95                     16
    Mission 2         1.50               .50                     16
    Mission 3          .75               .375                    12
    Mission 4          .40               .20                  Unlimited

TABLE 2.

        Alternative Force Compositions, $(s_1, s_2)$
        (Numbers of Units on-station)

        General Purpose    Special Purpose
          Submarines         Submarines
              35                  0
              25                 10
              20                 17
              15                 24

Unit  effectivenesses  and  force  compositions  are  stated  in  terms  of  on-station  submarines;  actual 
numbers  of  operational  units  would  be  higher  than,  and  not  necessarily  in  proportion  to,  the 
numbers  shown.  The  alternative  forces  shown  might  well  be  equal  cost  options  if  there  were 
some  fixed  cost  associated  with  deploying  any  special  purpose  submarines.  The  fourth  mission 
is  not  limited  in  the  number  of  forces  which  can  be  employed  or  the  total  effectiveness  which 
can  be  obtained.  This  might  be  thought  of  as  undirected  open-ocean  search,  which  could 
always  be  undertaken  by  any  submarine  not  otherwise  assigned. 

The distributions of $b$, reflecting uncertainty, are represented by lists of 60 sample
vectors, each considered equally likely. The lists are not repeated here. Sample vectors were
generated by Monte-Carlo methods, assuming each $b_i$ is an independent truncated* Gaussian
random variable with the above stated mean and relative standard deviations of 35% and 60% in
the two cases considered. Effectiveness for the alternative force compositions is shown in
Table 3, following.
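
One way such sample lists might be produced is sketched below. The paper does not give its truncation points; the sketch assumes a symmetric truncation to the interval from 0 to twice the mean, which keeps the mean unchanged and the samples nonnegative. The function name `sample_conflicts`, the seed, and the use of rejection sampling are choices made here for illustration, and only the three limited missions are sampled.

```python
import random
from typing import List, Sequence


def sample_conflicts(
    means: Sequence[float],
    rel_sd: float,
    count: int = 60,
    seed: int = 0,
) -> List[List[float]]:
    """Draw `count` conflict vectors with independent truncated Gaussian entries.

    Each entry has mean mu and standard deviation rel_sd * mu before
    truncation; draws outside [0, 2 * mu] are discarded (assumed rule).
    """
    rng = random.Random(seed)
    samples: List[List[float]] = []
    for _ in range(count):
        b: List[float] = []
        for mu in means:
            while True:
                x = rng.gauss(mu, rel_sd * mu)
                if 0.0 <= x <= 2.0 * mu:  # symmetric truncation about the mean
                    b.append(x)
                    break
        samples.append(b)
    return samples


if __name__ == "__main__":
    means = (16.0, 16.0, 12.0)  # expected opportunities for missions 1 to 3
    for rel_sd in (0.35, 0.60):
        sample = sample_conflicts(means, rel_sd)
        averages = [sum(col) / len(col) for col in zip(*sample)]
        print(rel_sd, [round(a, 2) for a in averages])
```

Averaging the optimal effectiveness over the sampled vectors then gives a Monte-Carlo estimate of expected effectiveness for each force composition.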

The maximal effectiveness for each level of uncertainty is enclosed in dashes. Not
surprisingly, the example values show a change in preference, from a mixed force to an all general
purpose force, as variability in mission opportunities increases. What is surprising is that
the changes and differences are so small overall. This can be explained qualitatively, and is a
reflection of a real concern in procurement decisions.


*Both high and low values were discarded so as to preserve the mean value and ensure that $b_i \ge 0$.
TABLE 3.

                                  Force Effectiveness, M(s)

Force              No Uncertainty    Relative Standard    Relative Standard
Compositions, s    (mean value       Deviation of         Deviation of
(s1, s2)           b used)           each b_i = 35%       each b_i = 60%

(35, 0)                38.2               37.4                -36.5-
(25, 10)               37.9               36.9                 35.8
(20, 17)              -39.1-             -37.6-                36.2
(15, 24)               37.9               37.1                 36.0

In the present example, the attractiveness of special purpose units rests on the availability
of opportunities in mission 1; i.e., if $b_1 \geq 11.4$ then forces including some special purpose
units are preferred to an all general purpose force. But mission 1 is a substantial part (36%) of
the projected employment of submarines; if this were taken away, then the force is over-built and
any alternative composition is able to exploit the remaining attractive opportunities. That is, if
$b_1 = 0$ then all force compositions entertained give about the same effectiveness; and as noted
above, if $b_1 \geq 11.4$, compositions involving special purpose units are preferred. In this
circumstance, i.e., with the numeric inputs to this example calculation, one cannot expect to see
dramatic changes in preferences among force compositions with explicit consideration of
uncertainty.

As a final point, we note the suboptimality of separating questions of force composition
from questions of force levels. Although this raises an issue worthy of further study, we only
mention the issue here by extending the previous example. Using exactly the same unit
effectiveness and mission opportunity values stated previously, but considering alternative force
compositions which involve an additional 5 general purpose submarines, one obtains the following
results:


TABLE 4.

                                  Force Effectiveness, M(s)

Force              No Uncertainty    Relative Standard    Relative Standard
Compositions, s    (mean value       Deviation of         Deviation of
(s1, s2)           b used)           each b_i = 35%       each b_i = 60%

(40, 0)                42.0               40.7                 39.6
(30, 10)               41.6               40.3                 39.0
(25, 17)            [illegible]           41.0                 39.6
(20, 24)               41.7               40.7                 39.7

In  this  case,  the  uncertainty  considered  does  not  lead  to  a  preference  for  an  all  general  purpose 
force,  although  again  the  effects  are  very  small.  The  tendency  here  is  intuitively  satisfying,  i.e., 
special  purpose  units  become  more  attractive  as  overall  force  levels  are  increased,  relative  to  a 
fixed  job  to  be  done.  Notice  also  that  increased  uncertainty  decreases  the  incremental 
effectiveness  of  the  additional  five  general  purpose  units,  in  every  case. 
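The incremental effectiveness of the five additional general purpose units can be checked by subtracting Table 3 from Table 4 entry by entry. The sketch below uses the table values as they can be read in this copy; the one Table 4 entry that is illegible is carried as `None` rather than guessed, and the function name is an assumption of this sketch.

```python
# Force effectiveness M(s); columns are
# (no uncertainty, 35% relative s.d., 60% relative s.d.).
table3 = {
    (35, 0): (38.2, 37.4, 36.5),
    (25, 10): (37.9, 36.9, 35.8),
    (20, 17): (39.1, 37.6, 36.2),
    (15, 24): (37.9, 37.1, 36.0),
}
table4 = {
    (40, 0): (42.0, 40.7, 39.6),
    (30, 10): (41.6, 40.3, 39.0),
    (25, 17): (None, 41.0, 39.6),   # first entry illegible in the source
    (20, 24): (41.7, 40.7, 39.7),
}

def increments(base, augmented, extra_general=5):
    """Gain in effectiveness from adding extra_general general purpose units."""
    result = {}
    for (s1, s2), base_vals in base.items():
        aug_vals = augmented[(s1 + extra_general, s2)]
        result[(s1, s2)] = tuple(
            None if a is None else round(a - b, 1)
            for a, b in zip(aug_vals, base_vals)
        )
    return result

gains = increments(table3, table4)
for composition, values in gains.items():
    print(composition, values)
```

For every composition whose no-uncertainty entry is legible, the gain at either level of uncertainty is smaller than the gain with no uncertainty.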
ABSTRACT

Abdel Hameed and Shimi [1] in a recent paper considered a shock model
with additive damage. This note generalizes the work of Abdel Hameed and
Shimi by showing that the a priori restriction to replacement at a shock time
made in [1] is unnecessary.

1.  INTRODUCTION 

A recent paper by Abdel Hameed and Shimi [1] was concerned with determining the
optimal replacement time for a breakdown model under the following assumptions: A device is
subject to a sequence of shocks occurring randomly according to a Poisson process with parameter
$\lambda$. Each shock causes a random amount of damage and these damages accumulate additively.
The successive shock magnitudes $Y_1, Y_2, \ldots$ are positive, independent, identically
distributed random variables having a known distribution function $F(\cdot)$. A breakdown can occur
only at the occurrence of a shock. Let $\delta$ denote the failure time of the device. For
$t < \delta$ let $X(t)$ be the accumulated damage over the time duration $[0, t]$. The device fails
when the accumulated damage $X(t)$ first exceeds $Z$. That is,

(1)  $\delta = \inf\{t \geq 0;\ X(t) > Z\}$,

where $Z$ is a random variable, independent of the accumulated damage process $X$, having a
known distribution function $G(\cdot)$ called the killing distribution. More explicitly, if
$X(t) = x$ and a shock of magnitude $y$ occurs at time $t$, then the device fails with probability

(2)  $\dfrac{G(x + y) - G(x)}{1 - G(x)}$.
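As a small numerical illustration of the conditional failure probability above, the sketch below evaluates it for an assumed exponential killing distribution. The choice of G, the rate `mu`, and the function name are assumptions made for the example only; the note itself leaves G general.

```python
import math

def failure_probability(G, x, y):
    """P(failure at a shock of size y | device survived with damage x)."""
    survival = 1.0 - G(x)
    if survival <= 0.0:
        raise ValueError("device cannot have survived to damage x")
    return (G(x + y) - G(x)) / survival

# Assumed killing distribution: exponential with rate mu (illustrative only).
mu = 0.5

def G(z):
    return 1.0 - math.exp(-mu * z)

p = failure_probability(G, x=2.0, y=1.0)
```

For an exponential G the result does not depend on the accumulated damage x and equals 1 - exp(-mu * y), which gives a quick check of the formula.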

Upon failure the device is immediately replaced by a new identical one with a cost of $c$. When
the device is replaced before failure, a smaller replacement cost is incurred. That cost depends
on the accumulated damage at the time of replacement and is denoted by $c(x)$. That is to say,
$c(x)$ is the cost of replacement before failure when the accumulated damage equals $x$. It is
assumed that $c(0) = 0$ and $c(x)$ is bounded above by $c$. Thus there is an incentive to attempt
to replace the device before failure. The condition $c(0) = 0$ has to be interpreted as a policy of
no replacement if there is no damage.

In their paper Abdel Hameed and Shimi [1] derived an optimal replacement policy that
minimizes the expected cost per unit time under the restriction that the device can be replaced
only at a shock point of time.
In the present article we consider a similar breakdown model without the above restriction
made in [1]. We allow a controller to institute a replacement at any stopping time before failure
time. He must replace upon device failure. Throughout, we restrict attention to replacement
policies for which cost of replacement is solely a function of the accumulated damage. In some
shock models, replacement at a scheduled time offers potential benefits relative to replacement
at a random time. However, the problem of scheduled replacement in failure models with additive
damage is an open problem and it is beyond the scope of the present study.

Let $T$ be the replacement time. At time $T$ the device is replaced by a new one having
statistical properties identical with the original, and the replacement cycles are repeated
indefinitely. The collection of all permissible replacement policies described above will be
denoted by $M$. Our objective is to prove that an optimal policy replaces the system at a shock
point of time. Thus the restriction on the class of permissible replacement policies made in
[1] can be omitted.

The following will be standard notation used throughout the paper: $E[Y; A]$, where $Y$ is a
random variable and $A$ is an event, refers to the expectation $E[1_A Y] = E[Y \mid 1_A = 1]\,P(A)$,
where $1_A$ is the set characteristic function of $A$.

2.  THE  OPTIMAL  POLICY 


By applying a standard renewal argument, the long run average cost per unit time when a
replacement policy $T$ is employed can be expressed as follows:

(3)  $\psi_T = \dfrac{E[c(X(T));\ T < \delta] + E[c;\ T = \delta]}{E[T]}$.


Let $\psi^* = \inf_{T \in M} \psi_T$.

Clearly

$\psi^* \leq \dfrac{E[c(X(T));\ T < \delta] + E[c;\ T = \delta]}{E[T]}$

for every $T \in M$, and the optimal replacement policy that minimizes $\psi_T$ over the set $M$
is the one that maximizes

(4)  $\theta_T = \psi^* E[T] + E[c - c(X(T));\ T < \delta]$.


By applying Dynkin's formula (see Theorem 5.1 and its Corollary in Dynkin [2]) equation (4)
reduces to

(5)  $\theta_T = E\left[\int_0^T J(X(s))\,ds\right] + c$,


where

(6)  $J(x) = \cdots\ c(x) - \int c(x+y)\,dF(y)\ \cdots$  [remaining terms illegible]


The proof of the above result follows a procedure similar to that used by the author in Section
2 of [3], and therefore is omitted.
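The long run average cost in (3) can also be estimated by simulating replacement cycles and dividing total cost by total time, which is the renewal argument in numerical form. The sketch below is an illustration only: it assumes exponential shock magnitudes and an exponential killing distribution, and it considers only control-limit policies that replace at the first shock time at which the accumulated damage reaches `limit`. All names, distributions, and parameter values are assumptions of the example.

```python
import random

def simulate_cycle(rng, lam, damage_rate, kill_rate, limit, c_fail, c_prev):
    """Simulate one replacement cycle; return (cycle length, cycle cost)."""
    z = rng.expovariate(kill_rate)          # killing threshold Z (assumed exponential)
    t, x = 0.0, 0.0
    while True:
        t += rng.expovariate(lam)           # waiting time to the next Poisson shock
        x += rng.expovariate(damage_rate)   # shock magnitude Y (assumed exponential)
        if x > z:                           # damage exceeds Z: failure, cost c
            return t, c_fail
        if x >= limit:                      # preventive replacement at a shock time
            return t, c_prev(x)

def average_cost(n_cycles, seed=0, **params):
    """Renewal-reward estimate: total cost divided by total time."""
    rng = random.Random(seed)
    total_time = 0.0
    total_cost = 0.0
    for _ in range(n_cycles):
        t, cost = simulate_cycle(rng, **params)
        total_time += t
        total_cost += cost
    return total_cost / total_time

common = dict(lam=1.0, damage_rate=1.0, kill_rate=0.25, c_fail=10.0,
              c_prev=lambda x: min(10.0, 2.0 * x))   # c(x) <= c, hypothetical form
replace_on_failure_only = average_cost(20000, seed=1, limit=float("inf"), **common)
control_limit_policy = average_cost(20000, seed=1, limit=3.0, **common)
```

Under these particular assumptions each shock causes failure with probability 0.25 / (0.25 + 1.0) = 0.2, so the mean time to failure is 5 and the failure-only policy has an exact cost rate of 10 / 5 = 2; the simulated value should be close to that.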

In what follows we shall denote by $S$ the state space of the stochastic process
$\{X(t);\ t < \delta\}$.


Let

(7)  $S_1 = \{x \in S;\ J(x) > 0\}$,

and

(8)  $S_2 = \{x \in S;\ J(x) \leq 0\}$.

Let $t_1, t_2, t_3, \ldots$ be the shock points of time and define

$W = \{t_i;\ i \geq 1\}$.

Let $L$ be the subclass of replacement policies in which a decision can be taken only over the set
$W$.

We  proceed  with  the  following  result: 

THEOREM 1: For every replacement policy $T_1 \notin L$, there exists a replacement policy
$T_2 \in L$ such that $\theta_{T_2} \geq \theta_{T_1}$.

PROOF: Let $T_1$ be a replacement policy such that $T_1 \notin L$.

Let $T(S_2)$ be the hitting time of the set $S_2$. That is

(9)  $T(S_2) = \inf\{t \geq 0;\ X(t) \in S_2\}$.

(It is understood that when the set in braces is empty, then $T(S_2) = \infty$.)

Let

(10)  $\hat{T} = \inf\{t \geq T_1;\ t \in W\}$

and define

(11)  $T_2 = \min\{\hat{T},\ T(S_2)\}$.

Clearly $T_2 \in L$. Next we show that $\theta_{T_2} \geq \theta_{T_1}$.

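Using (5) we obtain

(12)  $\theta_{T_2} - \theta_{T_1} = E\left[\int_0^{T_2} J(X(s))\,ds\right] - E\left[\int_0^{T_1} J(X(s))\,ds\right]$

      $= E\left[\int_{T_1}^{\hat{T}} J(X(s))\,ds;\ T_2 = \hat{T}\right] - E\left[\int_{T(S_2)}^{T_1} J(X(s))\,ds;\ \hat{T} > T(S_2)\right]$.

Note that

I. $\{T_2 = \hat{T}\}$ implies that $\{T(S_2) \geq \hat{T}\}$ and therefore

$E\left[\int_{T_1}^{\hat{T}} J(X(s))\,ds;\ T_2 = \hat{T}\right] \geq 0$.

II. $J(X(s))$ for $T(S_2) < s < T_1$ is non-positive on the set $\{\hat{T} > T(S_2)\}$. Therefore

$E\left[\int_{T(S_2)}^{T_1} J(X(s))\,ds;\ \hat{T} > T(S_2)\right] \leq 0$.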


Therefore, using (12), we obtain

$\theta_{T_2} - \theta_{T_1} \geq 0$

as desired.
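The construction used in the proof, (9) through (11), can be traced on a single sample path. The sketch below is an illustration under assumptions: the path is given as a finite list of shock epochs with the accumulated damage just after each shock, and membership in the set of (8) is supplied as a hypothetical predicate `in_s2`, since it depends on J, which is not evaluated here.

```python
from bisect import bisect_left

def modified_policy(shock_times, damage_after_shock, t1, in_s2):
    """Return (t_hat, t_s2, t2) for a given replacement time t1 on one path.

    shock_times        -- increasing shock epochs t_1 < t_2 < ...
    damage_after_shock -- accumulated damage X(t_i) just after each shock
    in_s2              -- predicate x -> True when x lies in S_2 (hypothetical)
    """
    # (10): first shock epoch at or after t1; infinite if none on this path.
    k = bisect_left(shock_times, t1)
    t_hat = shock_times[k] if k < len(shock_times) else float("inf")
    # (9): hitting time of S_2; damage starts at 0 and changes only at shocks.
    t_s2 = float("inf")
    if in_s2(0.0):
        t_s2 = 0.0
    else:
        for t, x in zip(shock_times, damage_after_shock):
            if in_s2(x):
                t_s2 = t
                break
    # (11): the modified replacement time.
    return t_hat, t_s2, min(t_hat, t_s2)

# Hypothetical sample path and replacement time between two shocks.
shock_times = [1.0, 2.5, 4.0, 6.0]
damage = [0.5, 1.2, 2.6, 3.1]
late_entry = modified_policy(shock_times, damage, 3.0, lambda x: x >= 3.0)
early_entry = modified_policy(shock_times, damage, 3.0, lambda x: x >= 1.0)
```

In the first call the path enters the assumed set only at time 6.0, so replacement is postponed from 3.0 to the next shock at 4.0; in the second it enters at 2.5, before the original replacement time, so replacement is advanced to 2.5.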

Recalling that an optimal replacement policy $T^*$ is the one that maximizes $\theta_T$ and using
Theorem 1, it follows that $T^* \in L$. Hence, the optimal policy derived in [1] is the optimal one
among all possible replacement policies for which cost of replacement is solely a function of the
accumulated damage.

Finally it should be pointed out that if the benefits of scheduled replacement were
considered, the conclusion reached, that an optimal policy replaces the device at a shock point of
time, would no longer generally hold.
ABSTRACT

Multiple regression analysis of first term reenlistment rates over the period
1968-1977 confirms previous findings that reenlistment is highly sensitive to
unemployment at the time of reenlistment and shortly after enlistment, almost
four years earlier. Bonuses, particularly lump sum bonuses, were also shown to
be a significant determinant of reenlistment.


This note reports the results of cross-sectional multiple regression analysis of first term
Navy reenlistment. Equations which were estimated represent the completion of research
conducted by Cohen and Reedy [1] which analyzed the sensitivity of first term reenlistment to
fluctuations in economic conditions at the time of reenlistment and about the time of
enlistment, considering the effect of the latter on reenlistment behavior four years later. The
principal finding of that study was that unemployment rates, both at the time of reenlistment and
about the time of enlistment four years earlier, were powerful predictors of reenlistment rates.
By comparison, measures of private sector versus military wages entered in the same equations
were generally found to be insignificant or, at best, relatively unimportant. That study did not,
however, take into account the influence of reenlistment bonuses, which this follow-up note
addresses.

This note describes the results of regression equations, replicating those which were the
basis of the original Cohen-Reedy paper, which include reenlistment bonus variables to
consider their influence upon Navy reenlistment over the ten year period, 1968-1977.

Reenlistment rates were compiled from Navy Military Personnel Statistics ("The Green
Book"), quarterly by rating, separately for E-4's and E-5's. To help minimize spurious
fluctuations in the data, reenlistment rates were calculated only for those quarters which had an
average of at least 10 eligibles per month. In addition, due to definitional and mensurational
inconsistencies, ratings which include nuclear power and diver NEC's were eliminated and other
ratings which include 6 year obligors (6YO's) were analyzed separately. The resultant data base
consisted of 3110 observations for 4YO ratings, and 787 observations for 6YO ratings. Each
observation referred to a specific quarter, rating and pay grade, either E-4 or E-5.

Four multiple linear regression equations were estimated: one for 4YO ratings (including
E-4's and E-5's); one for 6YO ratings (including E-4's and E-5's); one for 4YO E-4's; and one
for 4YO E-5's. No attempt was made to estimate separate equations for each major
occupational category as was done in the previous study. Given observed variations in earlier
equations, collective treatment of ratings has probably resulted in depressed $R^2$ statistics.

The dependent variable, RATE3, is the percentage deviation of the current quarter
reenlistment rate from the mean reenlistment rate for that rating and pay rate over the 10 years
under study, 1968-1977.

$\text{RATE3} = \dfrac{\text{Quarterly Reenlistment Rate} - \text{Mean (10 Year) Reenlistment Rate}}{\text{Mean (10 Year) Reenlistment Rate}}$

This specification of the dependent variable was adopted to contend with wide variations in the
level of reenlistment rates from rating to rating. RATE3 describes relative changes in
reenlistment rates.
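The RATE3 transformation can be sketched in a few lines. The quarterly rates below are hypothetical and the series is far shorter than the ten years of data used in the study; the function name is an assumption of the example.

```python
def rate3(quarterly_rates):
    """Relative deviation of each quarterly rate from the mean of the series."""
    mean_rate = sum(quarterly_rates) / len(quarterly_rates)
    return [(rate - mean_rate) / mean_rate for rate in quarterly_rates]

# Hypothetical quarterly reenlistment rates for one rating and pay grade.
rates = [0.20, 0.30, 0.25, 0.25]
deviations = rate3(rates)
```

Because each series is scaled by its own mean, a rating with a 10% average rate and one with a 40% average rate are put on a common footing, which is the stated purpose of the specification.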

Independent variables included in the equations are listed and defined in Table 1.


TABLE 1. Independent Variables

AUR ......... current national unemployment rate

ARAUR ....... average rate of change in unemployment (AUR) over the
              past 6 quarters preceding the reenlistment decision

AUR13 ....... unemployment (AUR) 13 quarters prior to the reenlistment
              decision (NOTE: Virtually uncorrelated with AUR.)

RW .......... the ratio of military basic pay to private sector earnings

AWARD ....... bonus award multiple

LS .......... dummy variable indicating lump sum payment of bonuses
              (LS = 1 for 1968-1974; LS = 0 for 1975-1977)

ELIG ........ number of individuals eligible for reenlistment

PAYRATE ..... dummy variable indicating rate
              (PAYRATE = 1 for E-5's; PAYRATE = 0 for E-4's)

DRAFT ....... number of persons drafted (all services) 18 quarters prior
              to reenlistment decision

WAR ......... dummy variable for Viet Nam War
              (WAR = 1 for 1968-1972; WAR = 0 for 1973-1977)

QTR3 ........ third quarter seasonal dummy
              (QTR3 = 1 for 3rd calendar quarter only)

TIME ........ time variable (TIME = Year - 67)
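The dummy and trend variables of Table 1 follow directly from the calendar year, quarter, and pay grade of an observation. The sketch below encodes the definitions as read from the table; the function name and the dictionary layout are assumptions of the example, and TIME is computed from a four-digit year on the assumption that "Year - 67" refers to a two-digit year.

```python
def dummy_variables(year, quarter, pay_grade):
    """Build the dummy and trend variables of Table 1 for one observation."""
    return {
        "LS": 1 if 1968 <= year <= 1974 else 0,        # lump sum bonus period
        "WAR": 1 if 1968 <= year <= 1972 else 0,       # Viet Nam War period
        "QTR3": 1 if quarter == 3 else 0,              # third calendar quarter
        "PAYRATE": 1 if pay_grade == "E-5" else 0,     # E-5 versus E-4
        "TIME": year - 1967,                           # Year - 67 with a two-digit year
    }

row_1974 = dummy_variables(1974, 3, "E-5")
row_1976 = dummy_variables(1976, 1, "E-4")
```

The remaining variables (AUR, ARAUR, AUR13, RW, AWARD, ELIG, DRAFT) are data series rather than indicators and are not reproduced here.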
In the context of cross-sectional analysis, estimated coefficients do not pertain to the
impact of a given variable over time for a specific rating, but represent the typical impact of that
variable over the entire 10 years across all ratings which were included in the study.

Results  of  the  estimation  procedures  are  summarized  in  Table  2. 

TABLE 2. Reenlistment Equations: Coefficients, (t-statistics), and Means


EQUATION      4YO      6YO      4YO/E-4      4YO/E-5
The three unemployment variables, AUR, ARAUR and AUR13, were specified precisely
as in the earlier Cohen-Reedy study. Consistent with those results, the significance of the
unemployment rate variables and the magnitude of their apparent effect upon reenlistment are
striking. Taken literally, coefficients in the 4YO equation, for example, show a one point
increase in AUR13 (+.01) indicating a 29 point (+.29) increase in RATE3. While it is
realized that these coefficients may overstate the real influence of unemployment, these equations,
like those which they are replicating, do indicate that reenlistment decisions may in fact be
sensitive to perceived costs of employment search and to the security of private sector
employment.
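The arithmetic behind this reading of the AUR13 coefficient can be made explicit. The coefficient of 29 used below is inferred from the statement that a +.01 change in AUR13 corresponds to a +.29 change in RATE3; the 20% mean reenlistment rate is a hypothetical figure chosen only to show what the change means in terms of the rate itself.

```python
def rate3_change(coefficient, delta_x):
    """Change in RATE3 implied by a change delta_x in one regressor."""
    return coefficient * delta_x

def implied_rate(mean_rate, rate3_value):
    """Invert the RATE3 definition: rate = mean * (1 + RATE3)."""
    return mean_rate * (1.0 + rate3_value)

delta = rate3_change(29.0, 0.01)        # one point rise in AUR13
new_rate = implied_rate(0.20, delta)    # hypothetical 20% mean rate
```

With these assumed inputs, a rating at its mean rate of 20% would move to roughly 25.8%, which shows why the text cautions that the coefficients may overstate the real influence of unemployment.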


The first compensation variable, RW, representing the ratio of military to private sector
wages, was calculated separately for E-4's and E-5's using basic pay for E-5's and E-6's
respectively as proxies for next-term earnings. RW was not a significant variable in any of the
four equations.
The other two compensation variables, AWARD and LS, relate to bonuses. AWARD is
the multiple for a particular rating in a given quarter, ranging from 0 to 6. This multiple is the
factor which the Navy applies against an individual's monthly pay to compute the dollar amount
of his bonus payment. AWARD was significant in all three 4YO equations. LS is a dummy
variable which assumes a value of 1 through calendar 1974 during the period when lump sum
awards were paid to approximately 50% of those individuals who reenlisted. Beginning January
1, 1975, a new policy was initiated which reduced the percentage of lump sum bonus payments
to approximately 10% of those reenlisting. The coefficient of LS indicates that when bonuses
were paid in lump sums, the percentage difference between actual reenlistment rates and mean
(10 year) reenlistment rates was higher by .45 than when bonuses were paid in installments.

The variable ELIG was included in the equations simply to capture the observed
relationship between low numbers of eligibles and high reenlistment rates.

PAYRATE  is  a  dummy  variable  which  distinguishes  between  E-4’s  and  E-5’s  (PAYRATE 
=  1).  TIME  was  included  to  capture  the  influence  of  factors  which  have  changed  steadily  over 
time  such  as  the  quality  of  life  improvements  effected  by  the  Navy  over  the  past  several  years. 

These  equations  support  the  authors’  earlier  findings,  notably  that  unemployment  rates  at 
the  time  of  the  reenlistment  decision  and  shortly  after  enlistment  are  important  determinants  of 
reenlistment  rates.  Relative  wages  continue  to  appear  unimportant.  It  appears,  however,  that 
reenlistment  bonuses  have  had  a  significant  positive  effect  on  reenlistment,  particularly  when 
those  bonuses  have  been  awarded  in  lump  sum  payments. 

Although by no means conclusive, the equations summarized in Table 2 suggest the
following management initiatives:

- Experimentation is warranted in the use of lump sum bonuses to mitigate the effects
of low unemployment rates on reenlistment.

- Opportunities to reenlist might be timed to coincide with low points (periods of high
unemployment) in the business cycle.

- AUR13 and predicted AUR should be used to augment current information used for
projecting reenlistment rates.

- Based on the continued performance of the AUR13 variable, serious consideration
must be given to implementing new programs designed to affect enlistee career decision
making very early during the first term of service.